NAVAL RESEARCH LOGISTICS QUARTERLY
DECEMBER 1970
VOL. 17, NO. 4
OFFICE OF NAVAL RESEARCH
NAVSO P-1278
NAVAL RESEARCH LOGISTICS QUARTERLY
EDITORS
H. E. Eccles
Rear Admiral, USN (Retired)
F. D. Rigby
Texas Technological College
O. Morgenstern
New York University
D. M. Gilford
U.S. Office of Education
S. M. Selig
Managing Editor
Office of Naval Research
Arlington, Va. 22217
ASSOCIATE EDITORS
R. Bellman, RAND Corporation
J. C. Busby, Jr., Captain, SC, USN (Retired)
W. W. Cooper, Carnegie Mellon University
J. G. Dean, Captain, SC, USN
G. Dyer, Vice Admiral, USN (Retired)
P. L. Folsom, Captain, USN (Retired)
M. A. Geisler, RAND Corporation
A. J. Hoffman, International Business
Machines Corporation
H. P. Jones, Commander, SC, USN (Retired)
S. Karlin, Stanford University
H. W. Kuhn, Princeton University
J. Laderman, Office of Naval Research
R. J. Lundegard, Office of Naval Research
W. H. Marlow, The George Washington University
B. J. McDonald, Office of Naval Research
R. E. McShane, Vice Admiral, USN (Retired)
W. F. Millson, Captain, SC, USN
H. D. Moore, Captain, SC, USN (Retired)
M. I. Rosenberg, Captain, USN (Retired)
D. Rosenblatt, National Bureau of Standards
J. V. Rosapepe, Commander, SC, USN (Retired)
T. L. Saaty, University of Pennsylvania
E. K. Scofield, Captain, SC, USN (Retired)
M. W. Shelly, University of Kansas
J. R. Simpson, Office of Naval Research
J. S. Skoczylas, Colonel, USMC
S. R. Smith, Naval Research Laboratory
H. Solomon, The George Washington University
I. Stakgold, Northwestern University
E. D. Stanley, Jr., Rear Admiral, USN (Retired)
C. Stein, Jr., Captain, SC, USN (Retired)
R. M. Thrall, Rice University
T. C. Varley, Office of Naval Research
C. B. Tompkins, University of California
J. F. Tynan, Commander, SC, USN (Retired)
J. D. Wilkes, Department of Defense
OASD (ISA)
The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and
will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics,
relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations.
Information for Contributors is indicated on inside back cover.
The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June,
September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing
Office, Washington, D.C. 20402. Subscription Price: $5.50 a year in the U.S. and Canada, $7.00 elsewhere. Cost of
individual issues may be obtained from the Superintendent of Documents.
The views and opinions expressed in this quarterly are those of the authors and not necessarily those of the Office
of Naval Research.
Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulations,
NAVEXOS P-35
Permission has been granted to use the copyrighted material appearing in this publication.
A GENERALIZED UPPER BOUNDING METHOD FOR DOUBLY
COUPLED LINEAR PROGRAMS
James K. Hartman
Naval Postgraduate School
Monterey, California
and
Leon S. Lasdon
Case Western Reserve University
Cleveland, Ohio
1. INTRODUCTION
The constraints of large linear programs can often be partitioned into independent subsets, except
for relatively few coupling rows and coupling columns. The individual subsets may, for example, arise
from constraints on the activity levels of subdivisions of a large corporation. Alternatively, such blocks
may arise from activities in different time periods. The coupling rows may arise from limitations on
shared resources or from combining the outputs of subdivisions to meet overall demands. The coupling
columns arise from activities which involve different time periods (e.g., storage), or which involve different subdivisions (e.g., transportation or assembly). The case with only coupling rows or coupling columns, but not both, has received much attention [8], [9]. A smaller amount of work has been done
on the problem with both coupling rows and columns. Ritter has proposed a dual method [2], [7].
Except for the preliminary work of Webber and White [10] and Heesterman [4], there is no primal
algorithm which exploits the structure of this problem. There is a need for such an algorithm, since
such problems occur often in practice. A primal method is desirable since in large problems slow con-
vergence may force termination of the algorithm prior to optimality.
The algorithm proposed here is an extension of the generalized upper bounding method for prob-
lems without coupling columns proposed in [5] and [6]. It produces the same sequence of extreme
point solutions as the primal simplex method, and hence has the desirable convergence properties of
that algorithm. However, the operations within each simplex iteration are organized to take maximal
advantage of problem structure. Because of this structure it is possible to perform the computations
while maintaining a compact representation of the basis inverse. In particular it is sufficient to main-
tain and update at each cycle a working basis inverse, inverses from each block, and possibly another
matrix, V in (4). The dimension of the working basis need never be more than the number of coupling
rows in the problem plus the number of coupling columns in the current basis. Hence its dimension
may change from cycle to cycle. At most one of the block inverses need be updated at any iteration.
Given these quantities, all information needed to carry out a simplex iteration can easily be obtained.
Further, efficient relations are given for updating the working basis and block inverses and V for the
next cycle.
A significant amount of computational work has been done on a special class of production and
inventory problems. Problems as large as 362 rows by 3225 columns were solved. Computation times
411
are encouraging, although no comparisons with other methods are available. The results indicate that
the algorithm is sensitive to the dimension of the working basis and that effective measures can be
taken to minimize this dimension.
When there are only coupling rows, the algorithm simplifies considerably and reduces to the
procedure described in [5]. When each diagonal block contains only a single row, it reduces further
to the generalized upper bounding method of Dantzig and Van Slyke [1].
2. BASIS STRUCTURE
The problem considered here is
minimize x_{01}

subject to

    B_0 x_0 + Σ_{i=1}^p A_i x_i = b_0

    D_i x_0 + B_i x_i = b_i,    i = 1, . . ., p,

where x_i is a vector with n_i components, all of which must be nonnegative except for x_{01}, the first component of x_0. The column corresponding to x_{01} is all zero except for a 1 in the first row. This first row of
the constraint matrix then defines the objective function. In matrix form the constraints can be
represented:

             x_0       x_1      x_2     . . .      x_p                       No. of rows
        [ B_0       A_1      A_2     . . .      A_p      ]   = b_0             m_0
        [ D_1       B_1                                  ]   = b_1             m_1
        [ D_2                B_2                         ]   = b_2             m_2
(1)     [  .                          .                  ]      .               .
        [ D_{p-1}                  B_{p-1}               ]   = b_{p-1}         m_{p-1}
        [ D_p                                  B_p       ]   = b_p             m_p
                                                                          total = M rows
        No. of columns:
           n_0       n_1      n_2     . . .      n_p                      total = N columns
The problem has p diagonal blocks of dimension m_i × n_i and an L-shaped border consisting of m_0 coupling constraints and n_0 coupling variables. The set S_i denotes either the variables in the vector x_i or the columns of (1) associated with these variables. The total constraint matrix has dimension M × N. We assume throughout that this matrix has rank M, so that any basis matrix for the system will contain M columns and will be nonsingular.
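Concretely, a constraint matrix with the structure of (1) can be assembled from its pieces. The following sketch (in Python with NumPy; all dimensions and data are invented purely for illustration) builds the M × N matrix from B_0, the A_i, the D_i, and the B_i:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented sizes: p = 3 diagonal blocks, m0 coupling rows, n0 coupling columns.
p, m0, n0 = 3, 2, 2
m = [3, 2, 3]   # m_i: rows of diagonal block i
n = [4, 3, 4]   # n_i: columns of diagonal block i

B0 = rng.standard_normal((m0, n0))                            # coupling rows x coupling cols
A  = [rng.standard_normal((m0, ni)) for ni in n]              # A_i: coupling rows over block i
D  = [rng.standard_normal((mi, n0)) for mi in m]              # D_i: block i rows over coupling cols
Bb = [rng.standard_normal((mi, ni)) for mi, ni in zip(m, n)]  # B_i: diagonal blocks

M, N = m0 + sum(m), n0 + sum(n)
C = np.zeros((M, N))
C[:m0, :n0] = B0
rs = np.cumsum([m0] + m)   # row offsets of the blocks
cs = np.cumsum([n0] + n)   # column offsets of the blocks
for i in range(p):
    C[:m0, cs[i]:cs[i + 1]] = A[i]                  # coupling rows
    C[rs[i]:rs[i + 1], :n0] = D[i]                  # coupling columns
    C[rs[i]:rs[i + 1], cs[i]:cs[i + 1]] = Bb[i]     # diagonal block

assert C.shape == (M, N)
```

Outside its own block and border, each block's row strip is zero, which is the structure the algorithm exploits.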
Consider the structure of any basis, B, for the system of constraints (1). Arranging the columns of B in the same order as in (1) yields the basis matrix in Figure 1, where the shaded areas contain nonzero entries. The column for x_{01} is always in the basis and will always be the first basic column.
[Figure 1: the basis matrix for this example. The first columns are the column for x_{01} and the coupling columns; the remaining basic columns come from the sets S_1, S_2, S_4, S_5, S_6, and S_7.]

Figure 1
In this example the problem has seven blocks. There are no columns from block 3 in this particular basis. The basis matrix consists of q rectangular blocks, q ≤ p, roughly along the diagonal with borders from the coupling rows and coupling columns. The rectangular blocks may be either "tall," "square," or "wide."
The algorithm to be developed depends on having the lower right hand partition of B consist of
square nonsingular blocks along the diagonal. Our strategy for handling the nonsquare blocks will be to
rearrange the rows and columns of the basis matrix as follows:
a) For blocks with a column excess, move the extra columns over beside the coupling columns
leaving a square block.
b) For blocks with a row excess, move the extra rows up with the coupling constraint rows leaving
a square block.
c) Assure that the resulting square diagonal blocks are nonsingular.
Performing these operations on the basis B in Figure 1 will give the matrix in Figure 2, where we have
[Figure 2: the rearranged basis. Rows: the m_0 coupling rows, the excess rows from blocks 3, 4, and 5, and the S rows of the block diagonal section. Columns: the l + 1 coupling columns (including the column for x_{01}), the excess columns, and the S columns of the block diagonal section.]

Figure 2
labeled the resulting nine submatrices for reference in Theorem 1. The submatrix γ_3 contains square nonsingular blocks and hence it is nonsingular. For this example block 5 was still singular after it was made square, so extra row-column pairs were removed from it until the remaining block became nonsingular. This introduced more excess rows and columns and created nonzero elements in the submatrix β_2.

For computational purposes the block diagonal section γ_3 should be as large as possible, but still nonsingular. A lower bound on its size is now derived.

THEOREM 1: Let S be the dimension of the block diagonal section γ_3 of Figure 2, M the dimension of the entire basis matrix, and l the number of coupling columns in the basis (excluding the column for x_{01}). Suppose that the partitioning has been done so all diagonal blocks are nonsingular, and that there is no alternate partitioning which would give larger nonsingular blocks. Then S ≥ M − l − m_0.
PROOF: Let any basis matrix for (1) be partitioned as illustrated in Figure 2:

        [ α_1   β_1   γ_1 ]
    B = [ α_2   β_2   γ_2 ]
        [ α_3   β_3   γ_3 ]
The block diagonal submatrix γ_3 is nonsingular, and hence has rank S. Consider the submatrix [β_3, γ_3]. It also has rank S since it has S linearly independent columns (those of γ_3) and only S rows. Suppose there is a row [β, γ] of the submatrix [β_2, γ_2] which is not a linear combination of the rows of [β_3, γ_3] and, say, this row is an excess row from block j. By the special structure of this row (γ is zero except in block j) there must be at least one excess column in

    [ β_2 ]
    [ β_3 ]

from the same block j (otherwise the row [β, γ] could not be independent). Since the row [β, γ] is independent of the rows in the jth block of [β_3, γ_3], this row and one of the excess columns can be adjoined to the jth block to give a larger square nonsingular block than before. But this contradicts the hypothesis of the theorem. Hence every row of [β_2, γ_2] is a linear combination of rows of [β_3, γ_3]. This proves that the submatrix

    K = [ β_2   γ_2 ]
        [ β_3   γ_3 ]
has rank S. Adjoin the first l + 1 columns to K, giving the submatrix

    L = [ α_2   β_2   γ_2 ]
        [ α_3   β_3   γ_3 ]

L has rank ≤ S + l since only l + 1 columns were added and the first of these is all zero in these rows. Finally adjoin the first m_0 rows to L, giving the entire basis matrix

        [ α_1   β_1   γ_1 ]
    B = [ α_2   β_2   γ_2 ]
        [ α_3   β_3   γ_3 ]

Then B has rank ≤ S + l + m_0 since only m_0 rows have been added to L and hence there can be at most m_0 more independent rows in B than in L. But rank B = M, so that M ≤ S + l + m_0, or S ≥ M − l − m_0, completing the proof of Theorem 1.
3. THE ALGORITHM
Our approach to the solution of doubly coupled problems will be to apply the revised primal
simplex method to the problem. The special structure of the basis matrix will be exploited to simplify
the computational and storage problems which occur for large problems. The revised simplex method
involves the following steps at each iteration:
a) Find the simplex multipliers π = c_B B^{-1}.

b) Price out the nonbasic columns P_j,

    c̄_j = c_j − π P_j,

and choose a column P_s with negative c̄_s to enter the basis if the current solution is not optimal.

c) Transform the entering column in terms of the current basis,

    P̄_s = B^{-1} P_s.

d) Determine the column to leave the basis,

    min { x_{Bi} / ā_{is} : ā_{is} > 0 } = x_{Br} / ā_{rs},

where ā_{is} is component i of P̄_s.
e) Pivot to update B" 1 and the current solution to account for the basis change.
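For reference, steps a) through e) can be sketched directly with a dense basis inverse; this is the baseline computation that the remainder of the paper reorganizes. A small NumPy illustration with invented data follows (the pivot row r is chosen here only so that the pivot is numerically safe, rather than by the ratio test in d)):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data for one iteration: basis B, basic costs c_B, entering column P_s.
Mdim = 5
B   = rng.standard_normal((Mdim, Mdim)) + 5 * np.eye(Mdim)  # well-conditioned basis
c_B = rng.standard_normal(Mdim)
P_s = rng.standard_normal(Mdim)
c_s = -1.0

pi    = np.linalg.solve(B.T, c_B)       # a) pi = c_B B^{-1}, via B' pi' = c_B'
c_bar = c_s - pi @ P_s                  # b) reduced cost of the candidate column
Pbar  = np.linalg.solve(B, P_s)         # c) Pbar = B^{-1} P_s

# d)-e) pivot on row r, updating B^{-1} with an elementary (eta) matrix.
r = int(np.argmax(np.abs(Pbar)))        # safe pivot row for this illustration
E = np.eye(Mdim)
E[:, r] = -Pbar / Pbar[r]               # eta column: -a_is / a_rs
E[r, r] = 1.0 / Pbar[r]                 # and 1 / a_rs in the pivot position

Bnew = B.copy()
Bnew[:, r] = P_s                        # basis after the column exchange
assert np.allclose(E @ np.linalg.inv(B), np.linalg.inv(Bnew))
```

The final assertion confirms the standard product-form fact used in step e): left-multiplying the old inverse by the eta matrix gives the new inverse.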
Steps a) and c) involve multiplication by the basis inverse matrix B^{-1}. To maintain B^{-1} for large problems requires extensive storage and computation, since even though B has special structure, B^{-1} is essentially dense with nonzero elements. Instead of performing the computations in a) and c) directly, we will solve the equations

(2)    π B = c_B        (solve for π)

and

(3)    B P̄_s = P_s      (solve for P̄_s)

by making a transformation of the basis matrix B → R, where R is block triangular. The resulting equations in terms of R will make it possible to exploit the special structure of B.
Consider a nonsingular matrix T defined so that BT = R is upper block triangular,

(4)    BT = [ G   H   ] [ I   0 ]  =  [ B̄   H   ]  = R,
            [ J   B_1 ] [ V   I ]     [ 0    B_1 ]

where B_1 is the block diagonal section γ_3; G, H, and J are the remaining partitions of B, e.g.,

    G = [ α_1   β_1 ]
        [ α_2   β_2 ]

and
(5)    V = −B_1^{-1} J.
The matrix B̄ is central to the procedure and will be called the working basis. From (4) and (5) we see that

(6)    B̄ = G + H V = G − H B_1^{-1} J.

By Theorem 1, B can always be partitioned so that the dimension of B̄ is at most m_0 + l.
THEOREM 2: The working basis B̄ is nonsingular.

PROOF: R is nonsingular since both B and T are. Since all elements of R below B̄ are zero, B̄ must be nonsingular.
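The factorization (4)-(6) is easy to check numerically. In the sketch below (invented dimensions; a single dense block stands in for the block diagonal section B_1), BT is verified to be block upper triangular with the working basis in its upper left corner:

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented partition sizes: k = M - S non-key rows, S block-diagonal rows.
k, S = 3, 4
G  = rng.standard_normal((k, k))
H  = rng.standard_normal((k, S))
J  = rng.standard_normal((S, k))
B1 = rng.standard_normal((S, S)) + 4 * np.eye(S)   # nonsingular stand-in for gamma_3

B = np.block([[G, H], [J, B1]])
V = -np.linalg.solve(B1, J)                        # (5)  V = -B1^{-1} J
T = np.block([[np.eye(k), np.zeros((k, S))], [V, np.eye(S)]])

R = B @ T                                          # (4)  BT = R
Bbar = G + H @ V                                   # (6)  working basis

assert np.allclose(R[k:, :k], 0.0)                 # lower-left block of R vanishes
assert np.allclose(R[:k, :k], Bbar)                # upper-left block of R is Bbar
assert np.allclose(Bbar, G - H @ np.linalg.solve(B1, J))
```

In the algorithm proper, B_1 is block diagonal, so the solves against it decompose block by block; the dense stand-in here is only to keep the check short.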
To find the simplex multipliers, consider Eq. (2). Multiply through by T, giving

(7)    π R = π B T = c_B T = (1, 0, . . ., 0) T = (1, 0, . . ., 0).

The last equality holds since only x_{01} appears in the objective function, and by convention x_{01} is always the first basic variable. Since T is nonsingular, this multiplication will not change the solution π. Now partition π as π = (π_0, π_1, . . ., π_q), where

π_0 has M − S components and is the multiplier for rows in the working basis B̄;

π_j has s_j components and is the multiplier for rows in the jth diagonal block B_{j1}, which is s_j × s_j and nonsingular, j = 1, . . ., q.

From the structure of R in (4), Eq. (7) becomes

(8)    π_0 B̄ = (1, 0, . . ., 0)    (M − S components), and

(9)    π_0 H_j + π_j B_{j1} = 0    (s_j components),  j = 1, . . ., q,

where H_j is the submatrix of H containing the s_j columns in the jth block. Solving (8) gives

(10)    π_0 = (1, 0, . . ., 0) B̄^{-1} = first row of B̄^{-1},

which can be substituted into (9) to give

(11)    π_j = −π_0 H_j B_{j1}^{-1},  j = 1, . . ., q.

Hence to obtain the simplex multipliers it suffices to know the inverse of the working basis and the inverse of each of the diagonal blocks.
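Formulas (8)-(11) can be exercised on a small invented example with two diagonal blocks; the computed multipliers are then checked against the defining equation πB = c_B:

```python
import numpy as np

rng = np.random.default_rng(3)

k = 3                        # M - S: dimension of the working basis
s = [2, 3]                   # invented block sizes s_j (q = 2 blocks)
S = sum(s)
off = np.concatenate([[0], np.cumsum(s)])

G = rng.standard_normal((k, k))
H = rng.standard_normal((k, S))
J = rng.standard_normal((S, k))
blocks = [rng.standard_normal((sj, sj)) + 4 * np.eye(sj) for sj in s]
B1 = np.zeros((S, S))
for j, Bj in enumerate(blocks):
    B1[off[j]:off[j + 1], off[j]:off[j + 1]] = Bj   # block diagonal section

B = np.block([[G, H], [J, B1]])
Bbar = G - H @ np.linalg.solve(B1, J)               # working basis (6)

pi0 = np.linalg.inv(Bbar)[0]                        # (10): first row of Bbar^{-1}
pi = [pi0]
for j in range(len(s)):
    Hj = H[:, off[j]:off[j + 1]]
    pi.append(-pi0 @ Hj @ np.linalg.inv(blocks[j])) # (11)
pi = np.concatenate(pi)

c_B = np.zeros(k + S)
c_B[0] = 1.0                                        # only x_{01} carries a cost
assert np.allclose(pi @ B, c_B)                     # pi solves (2)
```

Only the working basis inverse and the q block inverses appear in the computation, exactly as the text asserts.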
Next consider Eq. (3) for transforming the entering column in terms of the current basis. Define a nonsingular change of variables by

(12)    z = T^{-1} P̄_s.

Then P̄_s = T z, so Eq. (3) becomes

(13)    R z = B T z = B P̄_s = P_s,

which is a block triangular system and hence easier to solve than (3). Partition z and P_s as (z_0, z_1, . . ., z_q) and (P_{s0}, P_{s1}, . . ., P_{sq}), respectively. Then (13) can be written as
(14)    B̄ z_0 + Σ_{j=1}^q H_j z_j = P_{s0}

(15)    B_{j1} z_j = P_{sj},  j = 1, . . ., q.

Solving (15) gives

(16)    z_j = B_{j1}^{-1} P_{sj},

which when substituted into (14) gives

(17)    z_0 = B̄^{-1} ( P_{s0} − Σ_{j=1}^q H_j z_j ).

If P_s is not a coupling column, these formulas simplify since then P_{sj} = 0 for all but one j (j = 1, . . ., q). Hence by (16) z_j = 0 for all but one j and the summation in (17) has only one term.

It is now a simple matter to obtain P̄_s from z using P̄_s = T z. With P̄_s partitioned as (P̄_{s0}, P̄_{s1}, . . ., P̄_{sq}), the structure of T in (4) gives

(18)    P̄_{s0} = I z_0 = z_0, and

(19)    P̄_{sj} = V_j z_0 + I z_j = V_j z_0 + z_j,  j = 1, . . ., q,

where V_j is the submatrix of V containing the rows in the jth block. This completes the transformation of the column entering the basis.
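Similarly, (14)-(19) can be checked numerically: z is obtained block by block, P̄_s is assembled through T, and the result is verified to satisfy (3). All sizes and data below are invented:

```python
import numpy as np

rng = np.random.default_rng(4)

k, s = 3, [2, 3]             # invented sizes: M - S, and the block sizes s_j
S = sum(s)
off = np.concatenate([[0], np.cumsum(s)])

G = rng.standard_normal((k, k))
H = rng.standard_normal((k, S))
J = rng.standard_normal((S, k))
blocks = [rng.standard_normal((sj, sj)) + 4 * np.eye(sj) for sj in s]
B1 = np.zeros((S, S))
for j, Bj in enumerate(blocks):
    B1[off[j]:off[j + 1], off[j]:off[j + 1]] = Bj
B = np.block([[G, H], [J, B1]])
V = -np.linalg.solve(B1, J)                       # (5)
Bbar = G + H @ V                                  # (6)

P_s = rng.standard_normal(k + S)                  # an entering column
Ps0 = P_s[:k]
Psj = [P_s[k + off[j]:k + off[j + 1]] for j in range(len(s))]

z = [np.linalg.solve(blocks[j], Psj[j]) for j in range(len(s))]     # (16)
z0 = np.linalg.solve(Bbar, Ps0 - sum(H[:, off[j]:off[j + 1]] @ z[j]
                                     for j in range(len(s))))       # (17)

Pbar = np.concatenate([z0] + [V[off[j]:off[j + 1]] @ z0 + z[j]      # (18), (19)
                              for j in range(len(s))])
assert np.allclose(B @ Pbar, P_s)                 # Pbar solves (3)
```

Note that only block-sized solves and the working basis solve are needed; the full M × M inverse never appears.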
4. CHANGING THE BASIS
Pricing out columns to select the entering column and choosing the column to leave the basis are
done as in the ordinary revised simplex method, so it only remains to describe the updating procedures.
Since the entire basis inverse B^{-1} is not needed, the updating requirements will be somewhat different from those of the ordinary revised simplex algorithm. In particular, reviewing the solution of Eqs. (2) and (3) shows that it is sufficient to update B̄^{-1}, B_{j1}^{-1} (j = 1, . . ., q), and V at each iteration.
Before describing the various cases which can occur, we derive a general result for updating the
working basis inverse.
THEOREM 3: If a basis change can be described by *B^{-1} = E B^{-1}, where *B^{-1} is the basis inverse after the change, and E is any transformation matrix, then the working basis inverse can be updated by

(20)    *B̄^{-1} = (E_1 + E_2 V) B̄^{-1},

where E_1 and E_2 are partitions of E to be described in the proof.
PROOF: From the definition and nonsingularity of R and T in (4), (5),

(21)    *R^{-1} = *T^{-1} *B^{-1} = *T^{-1} E B^{-1} = *T^{-1} E T R^{-1}.

Here all *'ed symbols relate to the system after the basis change is made. Writing (21) in partitioned form and using partitioned inverse theorems gives
(22)    [ *B̄^{-1}   −*B̄^{-1} *H *B_1^{-1} ]     [ I     0 ] [ E_1   E_2 ] [ I   0 ] [ B̄^{-1}   −B̄^{-1} H B_1^{-1} ]
        [ 0          *B_1^{-1}            ]  =  [ −*V   I ] [ E_3   E_4 ] [ V   I ] [ 0         B_1^{-1}           ]
To update the working basis inverse, note that it appears in the upper left hand partition of *R^{-1}. Hence deleting all but that submatrix gives the following specialization of (22):

(23)    *B̄^{-1} = [ I   0 ] [ E_1   E_2 ] [ I ] B̄^{-1}
                            [ E_3   E_4 ] [ V ]

                = [ E_1   E_2 ] [ B̄^{-1}   ]
                                [ V B̄^{-1} ]

                = (E_1 + E_2 V) B̄^{-1}.

Thus E_1 + E_2 V is a transformation matrix for the working basis inverse.
Let the columns from the block diagonal section of B be called "key" columns. The remaining
columns of B are called "non-key." There are several cases to consider in the updating procedure:
CASE 1: The column leaving the basis is non-key. Then the entering column can be brought into the basis as a non-key column, and the standard updating formula for B^{-1} is *B^{-1} = E B^{-1}. Here E is an elementary matrix, equal to the identity except in the rth column, which is given by (η_1, η_2, . . ., η_M)′, where

(24)    η_i = −ā_{is} / ā_{rs},  i = 1, . . ., M,  i ≠ r,
        η_r = 1 / ā_{rs}

(recall that the rth basic column is leaving the basis). Since the leaving column is non-key, partitioning E as in Theorem 3 gives E_2 = 0, and E_1 is an (M − S) × (M − S) elementary matrix equal to the identity except in the rth column, which is (η_1, . . ., η_{M−S})′.
To update the working basis inverse for Case 1, substitute these into (20):

(25)    *B̄^{-1} = (E_1 + E_2 V) B̄^{-1} = (E_1 + 0 · V) B̄^{-1} = E_1 B̄^{-1}.

Hence B̄^{-1} is updated by a single pivot operation.
None of the key columns in B change in Case 1, and hence none of the diagonal blocks B_{j1} in B_1 will change. Thus the block inverses, B_{j1}^{-1}, require no updating. To update V, recall that *V = −*B_1^{-1} *J. Now *B_1^{-1} = B_1^{-1} as shown above, and *J is different from J only in the rth column, which is replaced by the last S components of the entering column, (P_{s1}, . . ., P_{sq})′. Thus only the rth column of V will change, and this will be replaced by
(26)    −B_1^{-1} (P_{s1}, . . ., P_{sq})′ = −(z_1, . . ., z_q)′,
which has already been calculated in transforming the entering column. This completes the updating
calculations for Case 1.
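Case 1 can be verified numerically by replacing a non-key column of an invented basis and comparing the pivot update (25) against a working basis inverse recomputed from scratch (a single dense block again stands in for the block diagonal section B_1):

```python
import numpy as np

rng = np.random.default_rng(5)

k, S = 3, 4                  # invented sizes: M - S non-key rows, S key rows
G = rng.standard_normal((k, k))
H = rng.standard_normal((k, S))
J = rng.standard_normal((S, k))
B1 = rng.standard_normal((S, S)) + 4 * np.eye(S)
B = np.block([[G, H], [J, B1]])
Bbar = G - H @ np.linalg.solve(B1, J)              # working basis (6)

P_s = rng.standard_normal(k + S)                   # entering column
Pbar = np.linalg.solve(B, P_s)                     # transformed entering column
r = int(np.argmax(np.abs(Pbar[:k])))               # leaving non-key column; safe pivot

E1 = np.eye(k)                                     # eta matrix built from (24)
E1[:, r] = -Pbar[:k] / Pbar[r]
E1[r, r] = 1.0 / Pbar[r]

Bn = B.copy()
Bn[:, r] = P_s                                     # basis after the exchange
Bbar_new = Bn[:k, :k] - H @ np.linalg.solve(B1, Bn[k:, :k])  # *Bbar from scratch

assert np.allclose(np.linalg.inv(Bbar_new), E1 @ np.linalg.inv(Bbar))   # (25)
```

A single k × k pivot reproduces the new working basis inverse; neither B_1^{-1} nor the full B^{-1} had to be touched.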
CASE 2: The column leaving the basis is a key column (from the block diagonal section).
CASE 2a: Both the leaving column and the entering column are from the same block B_{j1}. Then if the entering column will leave B_{j1} nonsingular, a direct pivot can be performed and the basis change can be described by *B^{-1} = E B^{-1}, where E is an elementary column matrix. The E matrix differs from that in Case 1 only because the (η_1, . . ., η_M)′ column is now in the key section of the matrix. Thus E_1 = I_{(M−S)×(M−S)}, and E_2 has only one nonzero column, (η_1, . . ., η_{M−S})′, which is the (r − (M − S))th.
To update the working basis inverse apply formula (20):

(27)    *B̄^{-1} = (E_1 + E_2 V) B̄^{-1}
                = ( I + (η_1, . . ., η_{M−S})′ v ) B̄^{-1}
                = B̄^{-1} + (η_1, . . ., η_{M−S})′ (v B̄^{-1}),

where v is the (r − (M − S))th row of V.
In the block diagonal submatrix B_1 only the (r − (M − S))th column will change. Hence only one diagonal block inverse, say B_{j1}^{-1}, will change. B_{j1}^{-1} will be updated by a single pivot, or equivalently,

(28)    *B_{j1}^{-1} = E B_{j1}^{-1},

where E is an elementary column matrix formed from the elements of the new column. This change will leave B_{j1} nonsingular if and only if the pivot element in this new column is nonzero. This condition must be checked to see if Case 2a can be applied.
The J submatrix will not change, and only the jth block of B_1^{-1} changes, so in V = −B_1^{-1} J only the jth partition will change. The updated version is given by

(29)    *V_j = −(*B_{j1}^{-1}) J_j = −E B_{j1}^{-1} J_j = E V_j,

so the same elementary matrix is used to update V.
CASE 2b: The column leaving the basis is a key column from block B_{j1}. Case 2a cannot be used, either because the entering column is not from the same block (it may be a coupling column) or because the nonsingularity test in Case 2a failed. Then we can avoid making B_{j1} smaller only if there is an excess column (see Figure 2) from block j which can be exchanged with the leaving column. If there is such a column, say the kth basic column, then after the exchange *B = B E, where E is a permutation matrix, an identity matrix with the rth and kth columns exchanged. Since E^{-1} = E, *B^{-1} = E B^{-1}.
Then (20) yields *B̄^{-1} = (E_1 + E_2 V) B̄^{-1}, where E_1 is an (M − S) × (M − S) identity except for a zero in the kth diagonal position, and E_2 is zero except for a one in the kth row and (r − (M − S))th column. Hence

(30)    *B̄^{-1} = (E_1 + E_2 V) B̄^{-1},

where E_1 + E_2 V equals the identity matrix except that its kth row is v, the (r − (M − S))th row of V. This is a valid interchange if and only if E_1 + E_2 V is nonsingular, which is true if and only if the kth element of the row v is nonzero. If all elements of v in the excess columns are zero, then no interchange is possible and we proceed to Case 2c.
If an exchange occurs, one column of the block B_{j1} will change. Hence the inverse can be updated by a simple pivot,

(31)    *B_{j1}^{-1} = E B_{j1}^{-1},

where E is an elementary matrix. The elements needed to form the eta column in E are found in the kth column of V_j.
To update V, note that by definition *V = −*B_1^{-1} *J. Only the jth block of B_1^{-1} changes, and a single column of J changes, but this column is zero outside the jth block J_j. Hence only the submatrix V_j needs to be updated,

(32)    *V_j = −*B_{j1}^{-1} *J_j = −E B_{j1}^{-1} *J_j.

Now *J_j = J_j except in the kth column, which is replaced by the leaving column, which we will suppose to be the lth column in B_{j1}. Thus B_{j1}^{-1} *J_j = −V_j except in the kth column, which becomes the lth unit vector, so that

(33)    *V_j = E V_j

except in the kth column, which is the negative of the eta column of E.
This completes the updating required for an exchange of key and non-key basic columns. After the exchange the leaving column is non-key, so Case 1 can be used to bring the new column into the basis. It should be noted that Case 1 uses the elements ā_{is} of the transformed entering column (see (24)) and the vector (z_1, . . ., z_q) (see (26)). The interchange will affect these quantities. To update them, interchange ā_{rs} and ā_{ks}, and replace z_j by E z_j.
CASE 2c: The column leaving the basis is a key column from block B_{j1}, and neither Case 2a nor 2b apply. When the column leaves, the new *B_{j1} will have one less column, and hence to remain square and nonsingular it must also lose a row. The process is most easily described by a repartitioning step followed by an application of Case 1. In the repartitioning step, the leaving column is shifted to the excess column partition, and some row of B_{j1} is made an excess row. Without loss of generality assume that the pair being shifted is the first column and row of B_{11}.
The basis matrix before and after the change are given below. All elements of the matrices are identical; only the partitioning changes.

(34)    B = [ G   H   ]
            [ J   B_1 ]

This is the original partitioning. The first row and column of B_{11} will be added to the non-key section, resulting in the new partitioning scheme given below.

(35)    B = [ *G   *H   ]
            [ *J   *B_1 ]

In this new partitioning of the basis, *G has become larger by one row and one column, *B_{11} has become smaller, and the other partitions change in corresponding ways.
Now *B = B, but because of the changes in partition sizes, *T ≠ T and *R ≠ R. Hence we must compute *B_{11}^{-1}, *B̄^{-1}, and *V.
*B_{11}^{-1} is computed via standard formulas. If

(36)    B_{11}^{-1} = [ W   X ]
                      [ Y   Z ],

where Z contains all but the first row and column, then

(37)    *B_{11}^{-1} = Z − Y X / W.

This is possible only if W ≠ 0, which provides a criterion for choosing the row to be shifted.†
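Formula (37) is the standard identity for the inverse of a block after a row-column pair is deleted, and it is easy to confirm on invented data:

```python
import numpy as np

rng = np.random.default_rng(6)

s = 4
B11 = rng.standard_normal((s, s)) + 4 * np.eye(s)  # an invented nonsingular block
Binv = np.linalg.inv(B11)
W = Binv[0, 0]                                     # partition (36) of B11^{-1}
X = Binv[0:1, 1:]
Y = Binv[1:, 0:1]
Z = Binv[1:, 1:]

small = B11[1:, 1:]                                # block after deleting row/col 1
assert abs(W) > 1e-12                              # the shift criterion W != 0
assert np.allclose(np.linalg.inv(small), Z - (Y @ X) / W)   # (37)
```

Note that the update costs only a rank-one correction of the already known inverse, which is the point of carrying B_{j1}^{-1} along.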
To obtain *B̄^{-1}, recall that B = *B, so

(38)    *R^{-1} = *T^{-1} *B^{-1} = *T^{-1} B^{-1} = *T^{-1} T R^{-1}.
†In general, if the kth column of B_{j1} is being shifted, then the lth row can be shifted if the lth element of row k in B_{j1}^{-1} is nonzero. There must always be at least one nonzero element in row k since B_{j1}^{-1} is nonsingular.
In partitioned form this is

(39)    [ *B̄^{-1}   −*B̄^{-1} *H *B_1^{-1} ]     [ I     0 ] [ I   0 ] [ B̄^{-1}   −B̄^{-1} H B_1^{-1} ]
        [ 0          *B_1^{-1}            ]  =  [ −*V   I ] [ V   I ] [ 0         B_1^{-1}           ]

To isolate *B̄^{-1}, delete all but the upper left hand corner of this expression, giving:

(40)    *B̄^{-1} = [ I   0 ] [ B̄^{-1}   h ]   =   [ B̄^{-1}      h       ]
                  [ v   1 ] [ 0         W ]       [ v B̄^{-1}   v h + W ],

where h is the first column of −B̄^{-1} H B_1^{-1}, W is the element defined in (36), and v is the first row of V. The vector h is computed as

(41)    h = first column of −B̄^{-1} H B_1^{-1}
          = −B̄^{-1} H (first column of B_1^{-1})
          = −B̄^{-1} H (W | Y | 0)′
          = −B̄^{-1} H_1 (W | Y)′,

where H_1 is the first partition of H.
We note that the repartitioning here changes all partitions of B. Hence in the expression B̄ = G − H B_1^{-1} J for the working basis, all four factors change. Nevertheless, we need only add a "border" to B̄^{-1} to get *B̄^{-1}. Starting from the equation *T^{-1} = *R^{-1} R T^{-1} and arguing as above, a formula for updating V is obtained:
(42)    *V = [ V_1 − (Y/W) v    Y/W ]
             [ V̂                 0  ],

where v is the first row of V, V_1 is the rest of the first block of V, and V̂ is all other blocks of V. Note that *V has one less row and one more column than V and that the only nontrivial change occurs in the first block.
Performing these operations repartitions the basis matrix so that the leaving column becomes non-key. Applying Case 1 will then complete the basis change for Case 2c.
CASE 3: Using the above cases the simplex method can be performed, but the working basis
will increase in dimension by one each time Case 2c occurs. Hence it may be desirable to periodically
repartition the basis by moving excess rows and columns into the block diagonal section, essentially
the reverse of Case 2c.
Suppose we have an excess row and an excess column from block j. Consider the following
submatrices
(43)    α       —  the element common to the excess row and the excess column;
        β       —  the rest of the excess row (over the columns of block j);
        γ       —  the rest of the excess column (in the rows of block j);
        B_{j1}  —  the jth diagonal block in B_1.

Bringing the row and column into the key section would mean adding them to B_{j1}, resulting in a larger jth diagonal block,

(44)    *B_{j1} = [ α   β      ]
                  [ γ   B_{j1} ].
We already know B_{j1}^{-1} and we want to compute the inverse of the new block. If this inverse exists, suppose that it is partitioned as

(45)    *B_{j1}^{-1} = [ W   X ]
                       [ Y   Z ].

Then [3] the blocks are given by

(46)    W = 1 / (α − β B_{j1}^{-1} γ)
        X = −W β B_{j1}^{-1}
        Y = −B_{j1}^{-1} γ W
        Z = B_{j1}^{-1} − B_{j1}^{-1} γ X,

and the inverse exists if and only if

(47)    α − β B_{j1}^{-1} γ ≠ 0.
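Formulas (46)-(47) are the standard bordered-inverse relations; a quick numerical check follows (the data are invented, with α fixed large enough that (47) holds safely):

```python
import numpy as np

rng = np.random.default_rng(7)

s = 3
Bj1 = rng.standard_normal((s, s)) + 4 * np.eye(s)
alpha = 5.0                                  # invented; keeps (47) safely nonzero
beta  = rng.standard_normal((1, s))          # excess-row piece
gamma = rng.standard_normal((s, 1))          # excess-column piece

Binv = np.linalg.inv(Bj1)
denom = alpha - float(beta @ Binv @ gamma)   # the quantity tested in (47)

W = 1.0 / denom                              # (46)
X = -W * (beta @ Binv)
Y = -(Binv @ gamma) * W
Z = Binv - (Binv @ gamma) @ X

Bbig = np.block([[np.array([[alpha]]), beta], [gamma, Bj1]])   # (44)
inv_big = np.block([[np.array([[W]]), X], [Y, Z]])             # (45)
assert np.allclose(Bbig @ inv_big, np.eye(s + 1))
```

Only products against the already known B_{j1}^{-1} are required, so testing a candidate row-column pair is cheap.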
For computational purposes, note that −B_{j1}^{-1} γ is just the jth partition of the column in V corresponding to the column being moved. Hence to test for possible pairs to be moved requires computation of an inner product of β with a column of V_j.
For convenience in expression, and without loss of generality, we again assume that the last row and column of the non-key section will become the first row and column of the key section. The elements of *B will be exactly the same as the elements of B; only the partitions will change. The new working basis *B̄ will become smaller. To find an expression for its inverse, consider the expression

(48)    *R^{-1} = *T^{-1} *B^{-1} = *T^{-1} B^{-1} = *T^{-1} T R^{-1}.
Writing this in partitioned form will give the following matrix equation:

(49)    [ *B̄^{-1}   −*B̄^{-1} *H *B_1^{-1} ]     [ I     0 ] [ I   0 ] [ B̄^{-1}   −B̄^{-1} H B_1^{-1} ]
        [ 0          *B_1^{-1}            ]  =  [ −*V   I ] [ V   I ] [ 0         B_1^{-1}           ]
Taking only the first rows and columns of this equation gives:

(50)    *B̄^{-1} = B̃,

where B̃ is just B̄^{-1} with the last row and column deleted. Thus to get the new working basis inverse, merely delete the last row and the last column in the current working basis inverse.
To calculate *V use the equation

(51)    *T^{-1} = *R^{-1} *B = *R^{-1} B = *R^{-1} R T^{-1}.

Then writing this in partitioned form and proceeding as above yields the result

(52)    *V = [ −W ψ      ]
             [ V_1 − Y ψ ]
             [ V̂         ],

where V_1 is the first partition of V less the last column, and V̂ is the remaining partitions of V less the last column. For this to be computed we must find ψ, which is the bottom row of the working basis less its last component ξ. The working basis is not carried along explicitly, so we must compute ψ, using
B̄ = G + H V from (6),
where G and H are partitions of the basis matrix B and V is updated at each iteration of the process.
(53)    [ψ, ξ] = last row of B̄
               = last row of G + (last row of H) · V
               = g + h · V        (g = last row of G, h = last row of H)
               = g + [β, 0] · V
               = g + β V_1,

and dropping the last element of this gives ψ.
In the above calculations the most complicated formulas were those for updating V in the two repartitioning procedures, Eqs. (42) and (52). An alternative procedure in these cases is to compute V from its definition V = −B_1^{-1} J. In each case, only one partition of V changes, say partition V_j, given by

(54)    *V_j = −*B_{j1}^{-1} *J_j,

with B_{j1} of dimension s_j. To compute this requires s_j^2 multiplications for each nonzero column of *J_j, a total of at most (M − S) × s_j^2 multiplications. The updating calculations require on the order of (M − S) × s_j multiplications. Hence they are computationally superior, but require more extensive program logic.
After performing the appropriate updating procedure, we are ready to start the next simplex
iteration. Thus the description of the basic algorithm is complete.
5. SPECIALIZATION TO PROBLEMS WITH ONLY COUPLING ROWS
For problems which have no coupling columns, the algorithm simplifies considerably. The major simplification is that no repartitioning will ever be necessary and there will never be any excess rows. Hence Cases 2c and 3 of the updating procedure, which are the most complicated, are never needed. The working basis B̄ in (6) will always have exactly as many rows as there are coupling constraints, and each of the blocks B_{j1} has dimension m_j (see (1)). Relations (10), (11) for the simplex multipliers remain the same, as do relations (16)-(19) for the transformed entering column. In updating, only Cases 1, 2a, and 2b can occur. These remain essentially the same. The computations involving V simplify because each column of V except the first has only one nonzero partition. This specialization of the algorithm is very similar to the method proposed by Kaul [5] and is described by Lasdon in [6].
6. APPLICATIONS TO PRODUCTION AND INVENTORY PROBLEMS
Consider a corporation which wishes to schedule the production of K products at L plants for T time periods into the future. At each plant, and for each period, the demand for each product is considered known and must be met in that period. The operation of the lth plant in the tth time period is hence limited by these K demand constraints and also by r constraints on locally available resources (e.g., plant capacity, labor), giving rise to a constraint block with K + r rows. There are LT such diagonal blocks. These blocks are coupled by constraints on scarce corporate resources which are allocated across the various plants and budgeted over time (e.g., corporate capital, scarce raw materials, highly
skilled labor). In addition to producing for immediate demand, any plant may
a) Produce a product and place it in inventory for future use.
b) Produce a product and ship it to some other plant which has a shortage of that product.
Each of the inventory and transportation activities gives rise to a column which couples two of
the diagonal blocks. Thus we have a doubly coupled linear program to solve. The number of rows is
on the order of (K+ r)LT, so truly large problems may result. There are KLT(2L + T—3)/2 coupling
columns arising from the inventory and transportation activities, but these have very special structure.
Each column has only a cost coefficient, a single +1 in the k-th demand equation for one block, and a
single -1 in the k-th demand equation for another block. Hence they may be stored implicitly and
priced out with minimal effort. Since these coupling activities incur a cost in addition to the production
cost, we anticipate that even though there are many such activities, relatively few will be profitable.
Hence in any basis there should be relatively few coupling columns so that the algorithm described
can be applied.
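Because each coupling column is fully determined by its cost and the two demand rows it links, it can be priced out without ever being stored explicitly. The following Python sketch illustrates the idea; the data layout and names are hypothetical, not the authors' implementation:

```python
# Implicit pricing of the structured coupling columns: each column is
# described by (cost c, source block p, destination block q, product k),
# i.e., a +1 in the k-th demand row of block p and a -1 in the k-th
# demand row of block q, so its reduced cost is c - pi[p][k] + pi[q][k].
# (Illustrative sketch only; not the authors' code.)

def price_coupling_columns(columns, pi):
    """Return (most negative reduced cost, its column), or (None, None)
    if every coupling column prices out nonnegative."""
    best_cost, best_col = None, None
    for col in columns:
        c, p, q, k = col
        reduced = c - pi[p][k] + pi[q][k]
        if reduced < 0 and (best_cost is None or reduced < best_cost):
            best_cost, best_col = reduced, col
    return best_cost, best_col

# Two hypothetical inventory/transport activities linking blocks 0 and 1:
pi = [[3.0, 1.0], [2.5, 4.0]]            # multipliers pi[block][product]
cols = [(0.2, 0, 1, 0), (0.2, 0, 1, 1)]  # (cost, from-block, to-block, product)
```

Only a winning column ever needs to be expanded into an explicit column, so even the KLT(2L + T - 3)/2 coupling activities cost just one pass of cheap arithmetic per pricing step.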
A number of computational simplifications appear. In computing the transformed entering
column, at most two of the Z_j in (16) will be nonzero, so (17) simplifies considerably. The special form
of the coupling columns implies that in J any coupling column has only two nonzero elements, +1 and
-1, while an excess column has only one partition nonzero. Hence in V = -B_1^{-1}J a coupling column
has at most two nonzero partitions, and these are columns of the block inverses B_jj^{-1}. An excess column
has one nonzero partition in V. Hence it is probably best to compute V at each iteration instead of
updating it. This is desirable since the most complicated update formulas are those involving V.
7. COMPUTATIONAL RESULTS
The algorithm described above has been coded and used to solve a number of test problems of
the type described in section 6. The program was written in FORTRAN V for the Univac 1108 com-
puter at Case Western Reserve University. The special structure of these problems made it possible
to solve reasonably large programs all in core. All problems were solved in single precision arithmetic.
Whenever a block inverse or the working basis inverse had been updated 50 times, it was re-inverted
using a standard Gaussian elimination routine. Good numerical accuracy was obtained in that different
runs on the same problem yielded solutions which were the same to seven significant figures. The
code was not written to be competitive with commercial routines, but rather to investigate the effects
of various pricing and repartitioning strategies. Nevertheless the solution times recorded are encouraging.
Data describing the test problems is given in Table 1. The notation used is as in section 6. For each
problem size, two problems were formulated. The lower numbered problem of each pair was constructed
to be relatively easy, in that few coupling columns appear in an optimal basis. The second problem in
each pair is derived from the first by adjusting the right hand side demand and resource availability
vector so that more coupling activities are required. Hence it is more difficult. For all problems, phase
I was initiated with a basis consisting of slack variables for all resource constraints and artificial vari-
ables for demand constraints.
Table 2 describes the effect of three different pricing strategies. Pricing strategy one allowed
coupling columns to enter the basis at any iteration. It led to long running times and large working
bases since many coupling columns tended to enter the basis in phase I, even in problems for which
there were none in the optimal basis. Pricing strategy two did not allow coupling columns to enter the
basis in the first M iterations unless all other columns priced out optimally. It produced a substantial
reduction in running time. Strategy three, which was the most successful, allowed no coupling columns
Table 1. Test Problem Descriptions

Problem   Products  Plants  Periods  Blocks  Block   Coupling   Coupling     Total
numbers     (K)      (L)      (T)     (LT)   size    columns      rows    problem size
 1, 2        5        3        4       12    7x22       210        6       90 x 474
 3, 4        5        3        6       18    7x22       405        8      134 x 801
 5, 6        5        3        8       24    7x22       660       10      178 x 1188
 7, 8        5        3       10       30    7x22       975       12      222 x 1635
 9, 10       5        5        6       30    7x22       975        8      218 x 1635
11, 12       5        5        8       40    7x22      1500       10      290 x 2380
13, 14       5        5       10       50    7x22      2125       12      362 x 3225
Table 2. Effect of Pricing Strategies

Problem  Pricing   Total       Total time  Iterations  Maximum size of  Maximum number of coupling
number   strategy  iterations    (sec)      per sec    working basis    columns in basis
   1        1         329        18.52       17.8            31                 28
            3         178         5.05       35.2             9                  3
   3        1         440        36.88       11.9            37                 34
            2         326        12.99       25.1            22                 15
            3         274         9.39       29.2            10                  4
   5        1         Exceeded 60-second time limit
            2         451        25.33       17.8            25                 20
            3         321        14.43       22.2            15                  6
   7        1         Not attempted
            2         557        38.07       14.6            27                 21
            3         438        24.12       18.2            21                 11

(All runs made with Case 3 repartitioning attempted every 10 iterations.)
J. K. HARTMAN AND L. S. LASDON
Table 3. Effects of Repartitioning Strategies

                                        Occurrences of updating cases    Final    Final no.  Max.     Max. no.
Problem  Repart.   Total     Total  Iter.                                working  coupling   working  coupling
number   strategy  iter.     (sec)  /sec    1    2a   2b   2c    3       basis    columns    basis    columns
   1        1       178      4.95   36.0    82   96   25    5    --       11        --         11        3
            2       178      5.05   35.2    81   97   26    7    17        6        --          9        3
            3       178      4.60   38.7    81   97   26    7    35        6        --          9        3
            6       178      4.62   38.5    81   97   25    9    53        6        --          9        3
   2        1       299      7.71   38.8   158  144   45    9    --       15         5         15        7
            2       311      7.66   40.6   163  148   50   17    31       10         5         11        8
            3       312      7.66   40.7   162  150   48   19    62        9         5         11        8
            6       311      7.72   40.3   162  149   49   20    64       10         5         12        8
   5        1       322     16.17   19.9   113  209   38    9    --       19        --         19        6
            2       321     14.43   22.2   108  213   38    9    31       10        --         15        6
            3       321     13.49   23.8   108  213   38    9    64       10        --         15        6
            6       321     13.46   23.8   108  213   38    9   174       10        --         15        6
   6        1       711     45.97   15.5   361  350   84   26    --       36        18         36       23
            2       733     39.81   18.4   379  354   82   63    73       22        16         25       20
            3       706     38.16   18.5   349  357   76   59   141       24        18         26       22
            6       675     36.33   18.6   322  353   69   78   426       23        17         28       23
   9        1       421     24.99   16.8   192  229   62   13    --       21         4         21       12
            2       416     21.96   18.9   188  228   55   34    41       12         6         17       13
            3       416     21.66   19.2   188  228   55   34    83       10         6         17       13
            6       416     21.38   19.5   188  228   55   32   132       10         6         18       13
  10        1      1023     62.14   16.5   638  385  200   25    --       33        15         33       26
            2      1034     54.13   19.1   656  378  175   77   103       21        15         29       26
            3      1025     54.78   18.7   646  379  177   77   205       21        15         26       23
            6      1023     53.01   19.3   650  373  171   85   705       21        15         27       22
  11        1       570     46.35   12.3   277  293   84   15    --       25         5         25       14
            2       565     38.43   14.7   263  302   86   39    56       12         5         21       16
            3       565     38.42   14.7   263  302   86   39   113       12         5         21       16
            6       577     38.78   14.9   285  292   86   40   244       11         5         21       16
  12        1      1194     92.68   12.9   748  446  233   28    --       38        17         38       23
            2      1209     84.14   14.4   764  445  217   76   121       23        16         30       26
            3      1208     89.49   13.5   766  442  225   81   241       25        17         33       27
            6      1231     87.02   14.1   784  447  220   98   900       25        18         33       26
  13        1       754     96.01    7.9   390  364  108   26    --       38         7         38       19
            2       747     69.41   10.8   372  375  102   55    74       14         9         26       19
            3       751     68.44   11.0   377  374   99   61   150       14         9         26       19
            6       769     73.02   10.5   397  372  104   72   372       17         9         30       22
  14        1      1364    137.38    9.9   787  577  258   81    --       43        20         43       26
            2      1390    123.29   11.3   790  600  238   79   139       28        20         33       26
            3      1389    118.84   11.7   807  582  247   82   278       27        20         34       28
            4      1351    119.74   11.3   750  601  237   89  1351       29        21         35       27
            5      1379    126.08   10.9   789  590  248   79   414       30        21         37       27
            6      1351    113.60   11.9   750  601  237   89  1066       29        21         35       27

(-- : entry not available; Case 3 never occurs under strategy 1.)
to enter the basis (unless necessary) until all artificial variables have left the basis. Pricing strategy
three was used for all of the remaining runs. All problems in Table 2 were solved with the repartitioning
procedure of Case 3 attempted every 10 iterations (see section 4).
Table 3 shows the effects of various repartitioning strategies, i.e. strategies for employing Case 3.
The strategies tested are to attempt Case 3:
1. never
2. every 10 iterations
3. every five iterations
4. every iteration
5. whenever there are at least 10 excess columns
6. whenever there are at least five excess columns.
The different strategies often gave rise to slightly different numbers of iterations, probably because
different strategies result in different orderings of the rows in the problem. Hence if the basis is de-
generate, ties may be broken in different ways, and different columns will leave the basis.
Using the strategy of never employing Case 3 involves a tradeoff. The computations of Case 3
never have to be performed, and Case 2c is performed less often. This is because excess columns
accumulate in the non-key section so the interchange of Case 2b is more likely to succeed. However
if Case 3 is not performed, then the dimension of the working basis increases whenever Case 2c is
performed, and never decreases. The overall result is that the number of iterations per second is lowest
among all strategies tested. Hence it is desirable to repartition the working basis periodically. Among
the other strategies tested, no consistent differences emerged.
REFERENCES
[1] Dantzig, G. B. and R. M. Van Slyke, "Generalized Upper Bounded Techniques for Linear Pro-
gramming," Journal of Computer and System Sciences 1, 213-226 (1967).
[2] Grigoriadis, M. D. and K. Ritter, "A Decomposition Method for Structured Linear and Nonlinear
Programs," Journal of Computer and System Sciences, 3, 335-360 (1969).
[3] Hadley, G., Linear Algebra (Addison-Wesley, Reading, Mass., 1961).
[4] Heesterman, A. R. G., "Special Simplex Algorithm for Multisector Problems," Numerische Mathe-
matik 12, 288-306 (1968).
[5] Kaul, R. N., "An Extension of Generalized Upper Bounded Techniques for Linear Programs,"
ORC-65-27, Operations Research Center, University of California, Berkeley (1965).
[6] Lasdon, L. S., Optimization Theory for Large Systems (Macmillan Company, New York, 1970).
[7] Ritter, K., "A Decomposition Method for Linear Programming Problems with Coupling Constraints
and Variables," MRC No. 739, Mathematics Research Center, University of Wisconsin (1967).
[8] Rosen, J. B., "Primal Partition Programming for Block Diagonal Matrices," Numerische Mathe-
matik 6, 250-260 (1964).
[9] Sakarovitch, M. and R. Saigal, "An Extension of Generalized Upper Bounding Techniques for
Structured Linear Programs," SIAM Journal on Applied Mathematics 15, 906-914 (1967).
[10] Webber, D. W. and W. W. White, "An Algorithm for Solving Large Structured Linear Program-
ming Problems," IBM New York Scientific Center Report No. 320-2946 (1968).
NONLINEAR PROGRAMMING
THE CHOICE OF DIRECTION BY GRADIENT PROJECTION*
Philip B. Zwart
Washington University
St. Louis, Missouri
ABSTRACT
Rosen's method of Gradient Projection chooses a search direction which is not neces-
sarily the direction of steepest ascent. However, the projection of the gradient onto a "suit-
ably chosen subspace" does yield the direction of steepest ascent. The suitable choice is
easily recognized as a result of some theorems relating gradient projection to steepest ascent.
These results lead to a modification of Rosen's method. The modification improves the choice
of search direction and usually yields the steepest ascent direction without solving a quadratic
programming problem.
I. INTRODUCTION
The general continuous nonlinear programming problem can be stated as: Find x* = (x_1*, . . ., x_n*)
which yields the maximum value of F(x) among all x which satisfy φ_i(x) ≥ 0, i = 1, . . ., k, i.e.,

(1)  Max F(x), subject to: φ_i(x) ≥ 0, i = 1, . . ., k.

F and the φ_i's are assumed to be concave and twice continuously differentiable. Concavity is
usually required for theoretical discussions of convergence. Numerical approaches to solving (1) can
often be successfully applied even if concavity is lacking. This is true for the material discussed herein.
Gradient Projection, first discussed by J. B. Rosen [2], [3], is a numerical iterative procedure for
solving (1), (or attempting to solve (1)) by a sequence of one dimensional searches made in the direction
of the "projected gradient."
This paper indicates a weakness in Rosen's method, gives some mathematical results relating
gradient projection to steepest ascent, and describes a small modification, which follows naturally
from the mathematical results and overcomes the indicated weakness.
Section II is devoted to notation used throughout the paper. Section III gives a rough description
of Rosen's method and a detailed description of those aspects with which this paper is concerned.
Section IV shows that all "active" constraints should not necessarily be used in projecting the gradient.
Section V reviews some linear algebra which may make the mathematical proofs more lucid. Section
VI presents two theorems which relate the "suitably projected" gradient direction to the direction of
*This research was supported in part by the Atomic Energy Commission under Research Contract No. AT(11-1)-1493, and
by the Department of Defense under THEMIS Grant No. F44620-69-C-0116.
steepest ascent. Section VII presents the suggested modification to Rosen's method. Section VIII re-
lates this material to the methods of feasible directions discussed by Zoutendijk. Concluding remarks
are given in Section IX.
II. NOTATION
Let F and φ_i, i = 1, . . ., k be as described in Section I. Then ∇F(x), ∇φ_i(x) denote the respec-
tive gradients at the point x = (x_1, . . ., x_n).
If I_q = {i_1, . . ., i_q} is a set of indices taken from 1, . . ., k, then P_q(x) denotes the projection
operation which maps any vector into its projection onto the subspace perpendicular to the subspace
spanned by the ∇φ_i(x), i ∈ I_q. For instance, P_q(x)∇F(x) is the projection of ∇F(x) onto the subspace
perpendicular to the subspace spanned by the ∇φ_i(x), i ∈ I_q. A unit vector in the direction of P_q(x)∇F(x)
is denoted by d_q(x). Finally, in cases where the argument x is obvious, the "(x)" is omitted. Thus,
∇F, ∇φ_i, P_q, d_q are used in place of ∇F(x), ∇φ_i(x), P_q(x), d_q(x).
III. ROSEN'S GRADIENT PROJECTION METHOD
Rosen's method of choosing a search direction is described below. It is assumed that the feasible
point x is given (this point is usually the result of a one-dimensional search starting from some other
point). We are only concerned with how a direction is chosen for the next one-dimensional search. Let
I_q = {i_1, . . ., i_q} be the set of indices i for which |φ_i(x)| < δ_i.
The δ_i's are preassigned small numbers. It is assumed that the ∇φ_i(x), i ∈ I_q are linearly inde-
pendent vectors. Rosen's Gradient Projection method chooses the search direction as follows:
We have P_q(x)∇F(x) = ∇F(x) - Σ_{i∈I_q} r_i ∇φ_i(x).
(Note: The vector r with components r_i, i ∈ I_q, is given by r = (N_q N_q′)^{-1} N_q ∇F(x), where N_q is the
matrix whose rows are the vectors ∇φ_i(x), i ∈ I_q.)
1. If ‖P_q(x)∇F(x)‖ is "very small" (described in detail in Rosen's papers) and r_i ≤ 0, i ∈ I_q, then
do not search any more.
2. If ‖P_q(x)∇F(x)‖ is not "small" (described in detail in Rosen's papers), then use the direction
of P_q(x)∇F(x) as the search direction.
3. If ‖P_q(x)∇F(x)‖ is "small" (described in detail in Rosen's papers) and some r_i > 0, i ∈ I_q, then
drop the index j from the set I_q to obtain a new set I_{q-1}. Use the direction of P_{q-1}(x)∇F(x) as the
search direction. The index j is determined by
r_j / ‖∇φ_j(x)‖ = max_{i∈I_q} { r_i / ‖∇φ_i(x)‖ }.
The details of what is meant by "very small" and "small" are not included here because the
discussion in this paper is independent of these details. This paper is concerned with steps 2 and 3.
Steps 2 and 3 say that j is dropped from the set of indices I_q only if ‖P_q(x)∇F(x)‖ is small.
Actually, as will be shown below, a better search direction (i.e., one which yields a greater rate
of increase of F) can be obtained by dropping j whenever
max_{i∈I_q} { r_i / ‖∇φ_i(x)‖ }
is positive.
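In matrix form the computation just described is a few lines of linear algebra. The numpy sketch below (an illustration under the linear-independence assumption, not Rosen's program) computes the multipliers r and the projected gradient:

```python
import numpy as np

def project_gradient(N, g):
    """N: matrix whose rows are the gradients of phi_i, i in I_q;
    g: gradient of F at x.  Returns (r, Pg), where r solves
    (N N') r = N g and Pg = g - N' r is the projection of g onto
    the subspace orthogonal to the rows of N."""
    r = np.linalg.solve(N @ N.T, N @ g)
    Pg = g - N.T @ r
    return r, Pg

# Hypothetical data: two active constraint gradients in E^3.
N = np.array([[1.0, -3.0, 1.0],
              [3.0, -1.0, 1.0]])
g = np.array([1.0, 1.0, 1.0])
r, Pg = project_gradient(N, g)
```

Pg is orthogonal to every row of N, and the signs of the components of r are exactly the r_i examined in steps 1 and 3 above.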
IV. ACTIVE CONSTRAINTS
Fig. 1 illustrates a situation in which the projected gradient direction, while acceptable, is certainly
not the best choice of search direction. Clearly, a greater rate of increase in F is obtained by searching
in the gradient direction. In such a situation, Rosen's method would choose the poorer search direction.
Figure 1. A projected gradient.
Fig. 2 illustrates another situation in which Rosen's method would make a poor choice of search
direction. The vector P_2(x)∇F(x), the projection of ∇F(x) onto the intersection of the planes
x_1 - 3x_2 + x_3 = 18 and 3x_1 - x_2 + x_3 = 20, would be chosen as a search direction by Rosen's method.
Clearly, P_1(x)∇F(x), the projection of ∇F(x) onto the plane 3x_1 - x_2 + x_3 = 20, gives an acceptable
search direction which yields a greater rate of increase in F.
Figure 2. Alternate projections of gradient.
The situation in Fig. 1 may seem unreasonable because one would not expect some previous one-
dimensional search to stop at x. However, the situation in Fig. 2 is reasonable. The point x could very
well be the result of a previous one-dimensional search along x\ — 3x 2 + x 3 = 18. Such a one-dimensional
search would stop at x because further searching would violate the constraint -3x_1 + x_2 - x_3 ≥ -20.
Let us designate a constraint φ_i ≥ 0 as "active" at x if |φ_i(x)| ≤ δ_i, where the δ_i's are preas-
signed small numbers. In other words, the i-th constraint is considered active at x if the hypersurface
φ_i = 0 appears to be close to x. (A better estimator of closeness would be |φ_i(x)|/‖∇φ_i(x)‖, but this is not
the primary concern of our discussion.) Rosen's method chooses the next search direction by taking
the projection of ∇F(x) onto the intersection of the hyperplanes which are tangent to the surfaces
φ_i = φ_i(x), for all active constraints (except when this projection is "small"). The examples in
Figures 1 and 2 indicate that sometimes it is better to take the projection of ∇F(x) onto an intersection
of a subset of these tangent hyperplanes. The remainder of this paper describes how to recognize this
situation and what to do about it.
V. PROJECTIONS
Let us recall a few facts from linear algebra, which are used in the proofs of Theorems 1 and 2.
Let W_1 ⊃ W_2 be linear subspaces of E^n. Let P_1 and P_2 be the projection mappings associated with
W_1 and W_2 respectively. For any vector A,
(1) P_iA · P_iA ≤ A · A, i = 1, 2, with < holding whenever A ≠ P_iA.
(2) P_2P_1A = P_2A.
(3) For any vector B ∈ W_1,
(A · P_1A)/‖P_1A‖ ≥ (A · B)/‖B‖, with = holding only when B = aP_1A, a real and positive.
These facts follow readily from the fact that A - P_iA is perpendicular to W_i, i = 1, 2.
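These facts are easy to confirm numerically. The sketch below (purely illustrative; the subspaces are randomly generated) checks (1) and (2) for a nested pair W_2 ⊂ W_1 in E^5:

```python
import numpy as np

rng = np.random.default_rng(0)

def projector(basis):
    """Orthogonal projector onto the span of the columns of `basis`."""
    Q, _ = np.linalg.qr(basis)
    return Q @ Q.T

W1 = rng.standard_normal((5, 3))   # W1 = span of 3 random columns in E^5
W2 = W1[:, :2]                     # W2 is a subspace of W1
P1, P2 = projector(W1), projector(W2)
A = rng.standard_normal(5)

# Fact (1): a projection never lengthens a vector.
assert np.linalg.norm(P2 @ A) <= np.linalg.norm(P1 @ A) <= np.linalg.norm(A)
# Fact (2): projecting onto W1 first changes nothing for W2.
assert np.allclose(P2 @ (P1 @ A), P2 @ A)
```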
VI. STEEPEST ASCENT VS PROJECTED GRADIENT
The steepest ascent direction at a point x is given by the unit direction vector y which maximizes
the rate of change of F (∇F(x) · y) and does not violate the hyperplanes which are tangent to the
active constraint surfaces (i.e., ∇φ_i(x) · y ≥ 0 for active φ_i's). Thus, the steepest ascent direction is
given by that vector d which solves the following nonlinear programming problem:

(2)  Max ∇F(x) · d, subject to: ∇φ_i(x) · d ≥ 0 for φ_i active (i.e., i ∈ I); d · d = 1.

This direction is the locally best search direction in the case of linear constraints. (Of course, any
one-dimensional search in this direction must be accompanied by small corrections to insure that the
nonlinear constraints φ_i ≥ 0 are not significantly violated during the search.)
Hadley [1, p. 301] mentions that although the solution to (2) can be obtained (by solving a quad-
ratic programming problem), simpler procedures can be used to find a d which satisfies the constraints
in (2) and has ∇F(x) · d > 0. Gradient Projection is one of several methods for finding such a d. The
following theorem indicates that the solution to (2) is given by the projection of the gradient onto a
suitable subspace.
THEOREM 1: The solution to (2) must be a unit vector having the direction of P_q(x)∇F(x) for
some I_q = {i_1, . . ., i_q}.
PROOF: Let d be the solution to (2).
Set I_q = {i ∈ I | ∇φ_i · d = 0}, and d_q = P_q∇F / ‖P_q∇F‖.
We will show that d = d_q. Suppose not. Then, since d is a unit vector with ∇φ_i · d = 0, i ∈ I_q, we have
by V(3),

(3)  ∇F · d = P_q∇F · d < P_q∇F · (P_q∇F/‖P_q∇F‖) = P_q∇F · d_q = ∇F · d_q.

Now, for any γ > 0,
∇φ_i · (d + γd_q) = 0, i ∈ I_q.
And, since ∇φ_i · d > 0, i ∉ I_q, γ can be chosen small enough so that
∇φ_i · (d + γd_q) > 0, i ∉ I_q.
Let γ̄ > 0 be such a choice of γ. Then d̄ = (d + γ̄d_q)/‖d + γ̄d_q‖ is a feasible vector for (2).
Moreover, for γ̄ sufficiently small,
∇F · d̄ = (∇F · d + γ̄ ∇F · d_q)/‖d + γ̄d_q‖ > ∇F · d,
since ∇F · d_q > ∇F · d by (3).
This last inequality contradicts the fact that d is the optimal solution to (2). Hence, we must have that
d = d_q. Q.E.D.
Theorem 1 implies that (2) can be solved by suitably determining I_q. If we assume that the ∇φ_i(x),
i ∈ I (i.e., the gradients of the active constraints) are linearly independent, then the next theorem char-
acterizes the proper choice for I_q.
THEOREM 2: P_q(x)∇F(x) (= ∇F(x) - Σ_{i∈I_q} r_i∇φ_i(x)) gives the direction of the solution to (2)
if and only if
i) P_q(x)∇F(x) · ∇φ_i(x) ≥ 0, i ∉ I_q, and
ii) r_i ≤ 0, i ∈ I_q.
PROOF: Set d_q = P_q∇F / ‖P_q∇F‖.
Sufficiency. Let d be any feasible vector for (2). Then, by use of the equation in the statement of the
theorem,
∇F · d = P_q∇F · d + Σ_{i∈I_q} r_i ∇φ_i · d
≤ P_q∇F · d
≤ ‖P_q∇F‖ = P_q∇F · d_q = ∇F · d_q.
Therefore, d_q maximizes ∇F · d.
Necessity. Condition (i) is necessary because d_q must be a feasible solution to (2).
Suppose (ii) does not hold. Let j ∈ I_q be such that r_j > 0. We will show that there is a feasible d̄ for which
∇F · d_q < ∇F · d̄.
First, let I_{q′} = {i ∈ I | ∇φ_i · P_q∇F = 0}.
Notice that P_{q′}∇F = P_{q′}P_q∇F = P_q∇F.
The first equality follows from V(2) and the fact that I_q ⊂ I_{q′}. The second equality follows from
the method of constructing I_{q′}.
Set I_{q′-1} = {i | i ∈ I_{q′} and i ≠ j}. We claim
(a) ∇φ_j · d_{q′-1} ≥ 0 and (b) ∇F · d_{q′-1} > ∇F · d_q.
To see (a), we observe that
P_{q′}∇F · P_{q′}∇F = P_{q′-1}∇F · P_{q′}∇F
= P_{q′-1}∇F · P_q∇F
= P_{q′-1}∇F · ∇F - Σ_{i∈I_q} r_i P_{q′-1}∇F · ∇φ_i
= P_{q′-1}∇F · ∇F - r_j P_{q′-1}∇F · ∇φ_j
= P_{q′-1}∇F · P_{q′-1}∇F - r_j P_{q′-1}∇F · ∇φ_j,
i.e., ‖P_{q′}∇F‖² = ‖P_{q′-1}∇F‖² - r_j P_{q′-1}∇F · ∇φ_j,
i.e., P_{q′-1}∇F · ∇φ_j = (1/r_j)(‖P_{q′-1}∇F‖² - ‖P_{q′}∇F‖²) ≥ 0, where this last inequality follows from the
fact that r_j > 0 and, by V(1), ‖P_{q′-1}∇F‖ ≥ ‖P_{q′}∇F‖ (since I_{q′-1} ⊂ I_{q′}). Assertion (a) now follows because d_{q′-1} is a unit vector in the direction
of P_{q′-1}∇F.
To see (b), first notice that if P_{q′-1}∇F = P_{q′}∇F, then we would have
∇F - Σ_{i∈I_{q′-1}} s_i∇φ_i = ∇F - Σ_{i∈I_{q′}} r_i∇φ_i,
i.e., r_j∇φ_j(x) = Σ_{i≠j} s_i∇φ_i - Σ_{i≠j} r_i∇φ_i, with r_j ≠ 0.
This last statement contradicts the linear independence of the ∇φ_i. Thus, P_{q′-1}∇F ≠ P_{q′}∇F, and since
P_{q′}∇F = P_{q′}P_{q′-1}∇F, by V(2),
we must have ‖P_{q′-1}∇F‖ > ‖P_{q′}∇F‖, by V(1).
But ‖P_i∇F‖ = P_i∇F · d_i, i = q′, q′-1. So
P_{q′-1}∇F · d_{q′-1} > P_{q′}∇F · d_{q′},
i.e., ∇F · d_{q′-1} > ∇F · d_{q′}.
But d_{q′} = d_q because, as noted above, P_{q′}∇F = P_q∇F. This gives us (b).
From the definition of I_{q′} and I_{q′-1} it follows that ∇φ_i · d_q = 0, i ∈ I_{q′}, and ∇φ_i · d_{q′-1} = 0, i ∈ I_{q′-1}.
This, combined with assertion (a), says that, for γ > 0,
∇φ_i · (d_q + γd_{q′-1}) ≥ 0, i ∈ I_{q′}.
Then, since ∇φ_i · d_q > 0, i ∈ I - I_{q′}, we can choose γ = γ̄ small enough so that ∇φ_i · (d_q + γ̄d_{q′-1}) > 0,
i ∈ I - I_{q′}. Thus
d̄ = (d_q + γ̄d_{q′-1})/‖d_q + γ̄d_{q′-1}‖
is a feasible vector for (2).
Also, using (b), we have
∇F · d̄ = (∇F · d_q + γ̄ ∇F · d_{q′-1})/‖d_q + γ̄d_{q′-1}‖
> (1 + γ̄) ∇F · d_q / ‖d_q + γ̄d_{q′-1}‖
≥ ∇F · d_q.
We have a contradiction to d_q being a solution to (2). This completes the proof of the necessity.
Q.E.D.
VII. MODIFIED GRADIENT PROJECTION
This section describes a simple modification to Rosen's method of gradient projection. This
modification helps to overcome the deficiencies which were described in Section IV. The modification
is justified by the following corollary to Theorem 2.
COROLLARY TO THEOREM 2: If I_q = I and P_q∇F = ∇F - Σ_{i∈I} r_i∇φ_i, where some r_j > 0, then
d_{q-1} (I_{q-1} = I_q - {j}) is a feasible solution to (2) with ∇F · d_{q-1} > ∇F · d_q.
(i.e., P_{q-1}∇F gives a better search direction than P_q∇F.)
PROOF: This result follows from the proof of the necessary condition in Theorem 2.
The above corollary suggests the following modification to the method of gradient projection.
Steps 2 and 3 of Section III should be replaced by the following step:
2′. If some r_i > 0, i ∈ I_q, then drop the index j from the set I_q to obtain I_{q-1}. Use the direction of
P_{q-1}∇F(x) as the search direction. The index j is determined by
r_j / ‖∇φ_j(x)‖ = max_{i∈I_q} { r_i / ‖∇φ_i(x)‖ }.
This modification gives an improved search direction in the sense that it gives a direction of
greater rate of increase. In addition, if the recalculated r_i's are all negative, then a "steepest ascent"
direction (the solution to (2)) is obtained. The termination of a search iteration usually involves the
encountering of no more than one new constraint surface. Thus, permitting one index to be dropped
should usually maintain the r_i's as negative. Of course, this will not always be the case.
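Combining the projection computation with the drop rule of step 2′ gives the modified direction choice in a few lines. The numpy sketch below is illustrative only (it drops at most one index per call, as the text suggests, assumes the active gradients are linearly independent, and uses hypothetical data for N and g):

```python
import numpy as np

def modified_direction(N, g):
    """Step 2': compute the multipliers r for the active gradients
    (rows of N); if max r_i/||grad phi_i|| is positive, drop that
    index and re-project.  Returns (unit direction, remaining rows)."""
    r = np.linalg.solve(N @ N.T, N @ g)
    scaled = r / np.linalg.norm(N, axis=1)      # r_i / ||grad phi_i(x)||
    j = int(np.argmax(scaled))
    if scaled[j] > 0:                           # drop index j (at most one)
        N = np.delete(N, j, axis=0)
    if N.shape[0] > 0:
        d = g - N.T @ np.linalg.solve(N @ N.T, N @ g)
    else:
        d = g.copy()                            # nothing active: use grad F
    return d / np.linalg.norm(d), N

# Hypothetical data: two tangent-plane normals and a gradient in E^3.
N = np.array([[1.0, -3.0, 1.0],
              [3.0, -1.0, 1.0]])
g = np.array([1.0, 1.0, 1.0])
d, kept = modified_direction(N, g)
```

With this g the second constraint is dropped, and the resulting d still satisfies ∇φ_i · d ≥ 0 for both normals, so the search stays on the feasible side while increasing F at a greater rate.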
VIII. Zoutendijk's Discussion
G. Zoutendijk [4] discusses several algorithms for handling nonlinear programming problems.
The algorithms are called "Methods of Feasible Directions." These algorithms fall into the general
framework of a sequence of one-dimensional searches. They differ primarily in the method of choosing
the search direction. The choice of search direction involves the use of a normalization. Different
normalizations yield different search directions and, consequently, different algorithms.
Zoutendijk's normalization N1 [4, p. 70], together with a choice of θ_i = 0, i ∈ I_c(x_k), for determining
S′(x_k) [4, p. 68, Eq. 7.3.6], yields a search direction which corresponds to the direction of steepest
ascent. Of course, choosing θ_i = 0 does not fit in with Zoutendijk's methods because he chooses to
avoid the necessity of calculating correction steps during the one-dimensional search. Nevertheless,
Zoutendijk's discussion of normalization N1 is closely related to the results of this paper.
In his discussion of normalization N1 [4, section 8.2, p. 80], Zoutendijk mentions that the re-
sulting direction-finding problem is of the same type as (2) above [4, p. 81, Eq. 8.2.3]. Thus, his results
and comments are applicable to the solution of (2). He discusses four methods of solving this problem.
Two of the methods are finite and are based on the work of Charnes and Dantzig, and Orden and
Wolfe. The other two methods devised by Zoutendijk are not finite, but are expected to be usually
more efficient. This serves to illustrate that there is evidently no easy way to solve (2). Thus, the small
modification to gradient projection proposed in section VII of this paper can be considered as a compro-
mise between either solving (2) exactly or accepting the first gradient projection as proposed by Rosen.
IX. CONCLUSION
The direction of steepest ascent in constrained problems has been shown to be the direction given
by the projection of the gradient onto a suitably chosen subspace. Moreover, the suitable choice is
recognized by the negativity of certain coefficients which arise during the calculation of the projected
gradient.
The above results give rise to a slight modification to Rosen's gradient projection method. The
modified projection method usually yields a search direction which is the direction of "steepest ascent."
The modified method when employed has the following favorable qualities:
1) The new projected gradient direction is better (has a greater rate of increase of the objective
function) than the original projected gradient direction.
2) The number of constraints being followed is reduced. Thus calculations of correction steps and
future projections are simplified because the matrix N_q′N_q (see [3]) is smaller.
REFERENCES
[1] Hadley, G., Nonlinear and Dynamic Programming (Addison-Wesley, Reading, Mass., 1964).
[2] Rosen, J. B., "The Gradient Projection Method for Nonlinear Programming, Part I," SIAM Journal
8, 181-217 (1960).
[3] Rosen, J. B., "The Gradient Projection Method for Nonlinear Programming, Part II", SIAM Journal
9, 414-432 (1961).
[4] Zoutendijk, G., Methods of Feasible Directions (Elsevier Publishing Company, Amsterdam, 1960).
SIMPLE STOCHASTIC NETWORKS: SOME PROBLEMS AND
PROCEDURES*
J. M. Burt, Jr.
University of California
Los Angeles, California
D. P. Gaver†
University of California
Berkeley, California
and
M. Perlas
Carnegie-Mellon University
Pittsburgh, Pennsylvania
1. INTRODUCTION
An extensive literature now exists describing and solving various problems related to aspects of
project graph analysis. Attention is called in particular to the papers of Charnes, Cooper, and Thompson
[2], Hartley and Wortham [5], Kelley [7], Martin [9], and Jewell [6], although there are many others as
well. We mention also the systematic book-length treatment of Moder [10].
Initially, project graph analysis (also mnemonically called PERT, GERT, CPM, etc.) assumed
that individual task completion times were fixed and known in advance. The unreality of such an as-
sumption in many contexts is apparent. Consequently attempts to introduce the probability distribu-
tions of completion times have been made; noteworthy for such work are [2], [5], and [9] above. Thus,
for example, one may be interested in the expectation or other summary of the entire project's comple-
tion time as the latter depends upon the distributions of the individual task times. Alternatively, it
may be of interest to seek the probability that a particular path will be "critical," i.e., will be the last
to be completed. Decision problems of one sort or another may be considered; cf. [2] and [6].
This paper deals with stochastic network problems of the sort just mentioned. Since most stochastic
networks of realistic scale are practically impossible to analyze mathematically, we begin by discussing
simulation, in particular illustrating the gains possible by use of Monte Carlo approaches for improving
naive "straightforward" simulation techniques. Considerable improvement in simulation efficiency
can be made if an approximate, analytically tractable, model is available, so we next introduce such
models and illustrate their use as a "control variate." The analytical network models suggested utilize
families of task length (link time) distributions based on the simple exponential density, and profit
by the Markovian or memoryless property associated with the exponential.
*This research was conducted in part under National Bureau of Standards Contract CST 136 at the Operations Research
Center, University of California, Berkeley. The work of the authors was also sponsored by the Management Sciences Research
Group, Carnegie-Mellon University, Pittsburgh, Pennsylvania.
†Visiting faculty member during the year 1967-1968.
2. SOME MONTE CARLO SIMULATION TECHNIQUES
It seems safe to say that only a minority of the actual networks encountered in practice can be
adequately studied using pencil and paper mathematics. This is not to state that analytical approaches
are useless: they can, and should, be used to reduce the complexity of parts of the network prior to a
computational approach, and they can offer plausibility checks. Ideally, one would like to obtain
enough insight into the behavior of stochastic networks so as to largely eliminate the need for formal,
perhaps computerized, studies. One way to obtain such insight is to examine various realistic networks
by means of sampling experiments, using a computer. Since this is also a popular way of solving specific
problems, a discussion of various techniques is given next. Our intention is to describe certain Monte
Carlo methods for efficiently dealing with the randomness present. The general purpose is to obtain
good determinations of, say, project completion time expectations or distributions while keeping
computational effort low. The methods we discuss were first put to use in physics; see the book by
Hammersley and Handscomb [4].
Figure 1.
A. Antithetic Variables
To explain this notion, refer first to the network of Figure 1 and suppose we wish to estimate the
expected completion time of T = T_1 + T_2 by simulation, where T_1 and T_2 are assumed to be independent.
The discussion which follows is valid for any size network composed of activities whose durations are
independent. This small example has been chosen solely to simplify the presentation. The straight-
forward procedure is to select two uniform random numbers, R_1 and R_2, and transform to realizations of
T_1 and T_2, perhaps by the familiar procedure
T_i = F_i^{-1}(R_i),
where F_i is the distribution of T_i, and F_i^{-1} its inverse. Then the first realization is
T^(1) = T_1^(1) + T_2^(1).
We could tabulate n such realizations using random numbers and average to obtain the estimate

(2.1)  T̄ = [T_1^(1) + T_2^(1) + T_1^(2) + T_2^(2) + · · · + T_1^(n) + T_2^(n)] / n.
All random numbers are independent, and the n times are generated from their appropriate
distributions, so
(2.2) E[T̄] = E[T_1] + E[T_2]

and

(2.3) Var[T̄] = (Var[T_1] + Var[T_2])/n.
SIMPLE STOCHASTIC NETWORKS 441
In other words, the simple procedure described gives an unbiassed estimator of E[T] whose variance
decreases as 1/n. If the procedure described is repeated many times (n → ∞), then T̄ becomes
arbitrarily close to E[T]. Unfortunately, if there are many independent serial links then the sum of the
variances in the numerator of (2.3) becomes large, and many repetitions are required in order to determine
E[T] accurately. An attempt to avoid this is in order.
Notice first that in order to estimate E[T] the realizations T_1^(j) + T_2^(j) and T_1^(k) + T_2^(k) need not be
independent so long as they have the correct marginal distributions. Intuitively, too, one sees that if,
when T_1 is "large" in one realization, it is forced to be correspondingly "small" in another, then the
average will tend to be closer to the true value E[T] than in the case of purely independent samples.
To accomplish this, we can construct two realizations of T = T_1 + T_2 using the same two random numbers:
first obtain R_1 and R_2, thence T_1 and T_2, and finally T; next subtract R_1 and R_2 from 1 to obtain
1 − R_i = R_i' (i = 1, 2)*, and from these T_1' and T_2', finally adding to get T'. T_1' and T_2' are the antithetic
variates of T_1 and T_2. Lastly, average:

(2.4) T̄_A = [(T^(1) + T^(1)') + (T^(2) + T^(2)') + . . . + (T^(n) + T^(n)')]/(2n)
    = (1/2)(T̄ + T̄').
Now by construction, E[T̄] = E[T̄'] = E[T], so the estimate T̄_A is unbiassed. Furthermore,

(2.5) Var[T̄] = Var[T̄'] = (Var[T_1] + Var[T_2])/n.

However, it should be intuitively apparent that T̄ and T̄' tend to be negatively correlated, i.e.,
cov[T̄, T̄'] < 0. (This property may be proven easily.) Since

(2.6) Var[T̄_A] = (1/4)(Var[T̄] + Var[T̄']) + (1/2) cov[T̄, T̄'] = (1/2) Var[T̄] + (1/2) cov[T̄, T̄'] < (1/2) Var[T̄],

this means that the above simple procedure is more efficient than doubling the total number of independent
realizations computed. One can, of course, estimate the variance of T̄_A by simply computing
the sample variance of the n independent averages

(T^(i) + T^(i)')/2    (i = 1, 2, . . ., n).

Since T̄_A is the average of n independent terms, approximate confidence limits may be placed on
E[T] using the Student t tables.
*The use of a few special random variate generators, such as the Normal variate generator using trigonometric functions,
should be avoided when the antithetic procedure is applied.
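The antithetic procedure just described can be sketched in a few lines of Python. This is a hypothetical illustration, not part of the original study: the two-link series network of Figure 1 with exponential link times of mean 10 (the distribution used in Section 3), the inverse-distribution transform, and the averaged antithetic pairs. Function names, sample size, and seed are our own choices.

```python
import math
import random

MEAN = 10.0  # assumed link-time mean, matching the exponentials of Section 3

def inv_exp(u, mean=MEAN):
    """Inverse-distribution transform T = F^(-1)(R) for an exponential link time."""
    return -mean * math.log(1.0 - u)

def antithetic_estimate(n=20000, seed=7):
    """Average n antithetic pairs (T + T')/2 for the series network T = T_1 + T_2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        r1, r2 = rng.random(), rng.random()
        t = inv_exp(r1) + inv_exp(r2)                    # realization from R_1, R_2
        t_prime = inv_exp(1.0 - r1) + inv_exp(1.0 - r2)  # antithetic realization T'
        total += 0.5 * (t + t_prime)
    return total / n

est = antithetic_estimate()  # close to E[T] = 20
```

Because each pair (T + T')/2 is negatively correlated internally, the sample variance of the pair averages is noticeably smaller than Var[T]/2, as (2.6) predicts.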
The dramatic effect of antitheticizing a sum is most apparent when T_i is symmetric, e.g., if T_i
can be assumed uniform, or normal. In the latter case we can write

(2.7) T_i = m_i + σ_i ξ_i,

ξ_i being distributed as N(0, 1); m_i and σ_i are the mean and standard deviation of T_i. Then the obvious
antithetic is obtained by simple sign reversal of ξ_i:

(2.8) T_i' = m_i − σ_i ξ_i,

and an average of T_i with T_i' already yields a zero-variance estimate of m_i.
For the latter example, and for others as well, it should be apparent that the use of antithetic variables
is helpful in efficiently establishing the appearance of a distribution function. One first determines
an empirical distribution, F(t), from the T_i's, and a second empirical distribution, F'(t), from the
T_i''s. The antithetic variate estimate of the true distribution would then be

(2.9) F̂(t) = [F(t) + F'(t)]/2 for all t.

If F(t) is overly skewed (relative to the true distribution) to one side, then F'(t) is likely to be overly
skewed to the other side, and F̂(t) will be close to the true unknown distribution. It should be noted
that estimates of Var(T) should not be formed by simply averaging the sample variances of the
antithetic variates; such an estimate may be quite biased. The authors have had some success in eliminating
this bias by application of the Tukey-Quenouille jackknife procedure.
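The exact cancellation in the normal case (2.7)-(2.8) is worth seeing concretely: a single antithetic pair already recovers m_i, whatever ξ_i is drawn. A minimal check (the mean and standard deviation below are assumed values for illustration):

```python
import random

rng = random.Random(3)
m, sigma = 10.0, 2.0           # assumed mean and standard deviation of T_i
xi = rng.gauss(0.0, 1.0)       # xi distributed as N(0, 1)
t = m + sigma * xi             # T_i, as in (2.7)
t_prime = m - sigma * xi       # antithetic T_i', as in (2.8)
average = 0.5 * (t + t_prime)  # the sigma*xi terms cancel, leaving m
```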
Figure 2.
As a second example of the use of the antithetic technique consider a simple parallel network,
e.g., Figure 2. The goal is to determine the probability with which M = max(T_1, T_2, . . ., T_k) is not
greater than some number x. For simplicity, let all T_i's have the same rectangular distribution: T_i is
uniform on [0, 1]. This is no restriction in fact, for, even if the T_i's were arbitrarily but identically
distributed, M ≤ x if and only if R_i ≤ y for all i, where y is the common distribution function evaluated at x.

To determine the above probability in a straightforward manner we can sample independently
for each component link. Then tabulate δ, where

δ = 1 if R_1 < x, R_2 < x, . . ., R_k < x
  = 0 otherwise.
Letting δ^(j) refer to the jth realization (j = 1, 2, . . ., n), the estimate of the desired probability is

(2.10) δ̄ = [δ^(1) + δ^(2) + . . . + δ^(n)]/n.

It is simple to see that

(2.11) E[δ̄] = x^k

and

(2.12) Var[δ̄] = x^k(1 − x^k)/n.

The ratio

(2.13) Var[δ̄]/E²[δ̄] = (x^(−k) − 1)/n

measures the fractional or percent error of the estimate. Plainly a large k tends to make the above
error measure large, requiring an enormous n to obtain a good determination.
An antithetic approach might go as follows. With δ associate the antithetic indicator δ', where

(2.14) δ' = 1 if (1 − R_1) < x, (1 − R_2) < x, . . ., (1 − R_k) < x
     = 0 otherwise.

Obviously δ and δ' cannot be unity simultaneously if x < 1/2; for simplicity x < 1/2 will be assumed.
Now form the antithetic estimate

(2.15) δ̄_A = [δ^(1) + δ^(1)' + δ^(2) + δ^(2)' + . . . + δ^(n) + δ^(n)']/(2n).

Clearly δ + δ' is unity with probability 2x^k. Hence

(2.16) E[δ̄_A] = n(2x^k)/(2n) = x^k,

so the estimate is unbiassed. Furthermore

(2.17) Var[δ̄_A] = x^k(1 − 2x^k)/(2n).

The proportional error of the estimate is

(2.18) Var[δ̄_A]/E²[δ̄_A] = (x^(−k) − 2)/(2n) < (1/2) · Var[δ̄]/E²[δ̄].

Of course the above antithetic procedure is merely one possibility. Another is to antitheticize on
some, but not all, of the links simultaneously. There are in all 2^k such possibilities.
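The indicator scheme (2.14)-(2.15) can be sketched directly; x and k below are arbitrary illustrative choices (with x < 1/2, as assumed in the text), and the sample size and seed are our own.

```python
import random

def antithetic_prob_estimate(x=0.4, k=3, n=100000, seed=11):
    """Estimate P{M <= x} = x**k for k parallel uniform links via delta and delta'."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        rs = [rng.random() for _ in range(k)]
        delta = int(all(r < x for r in rs))               # straightforward indicator
        delta_prime = int(all(1.0 - r < x for r in rs))   # antithetic indicator (2.14)
        total += delta + delta_prime
    return total / (2.0 * n)

est = antithetic_prob_estimate()  # true value is 0.4**3 = 0.064
```

Since x < 1/2 the two indicators are never unity together, so each pair contributes a Bernoulli(2x^k) count, in line with (2.16)-(2.17).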
B. Stratification
The Monte Carlo technique called stratification is an extension of the concept underlying
antithetic variates, i.e., the creation of parallel, negatively correlated realizations that are to be
averaged. To simplify the exposition, let us initially assume our network contains a single activity
whose time-to-completion is denoted by T_1. As mentioned above, realizations of T_1 are created by taking
uniform random draws over the unit interval and transforming them, usually via the inverse distribution
function. "k-way stratification" is performed in the following manner. Divide the unit interval (0, 1)
underlying realizations of T_1 into k disjoint and exhaustive subintervals; as a practical matter, one
usually uses subintervals of equal length, although other choices are available (see [4, p. 56]). Draw a
uniform random number, R, and compute k parallel realizations of T_1 from the random numbers
R/k, (1 + R)/k, . . ., (k − 1 + R)/k. Randomly assign these realizations of T_1 to k parallel simulations.
Repeat the procedure by drawing a second random number, and so on.

The above procedure leads to k parallel sets of realizations of T_1, and it can easily be shown that
within any one of these sets realizations of T_1 are independent. However, between the parallel sets of
T_1's there may be considerable negative correlation. Thus, if one set has an unusually large number of
low realizations of T_1 (i.e., lower than E[T_1]), some other set(s) will have a correspondingly small
number of low realizations of T_1. By taking averages across the sets, low-variance estimates of, say,
E[T_1] may be obtained.

For networks containing many activities, stratification may be performed on some or on all of the
activity times, and different k's may be chosen for different activities. There is considerable "art"
involved in selecting which activity times should be stratified and how detailed (choice of k) such
stratification should be. Generally, for activity times which have relatively large variance and activities
which have a high probability of lying on the critical path, one would use fairly detailed stratification
(about 4- or 5-way).
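A minimal sketch of k-way stratification for a single exponential activity time follows. For brevity the k parallel values from each round are pooled into a single average rather than routed to k parallel simulations as the text describes; the parameter values and seed are our own assumptions.

```python
import math
import random

def stratified_mean(k=4, rounds=5000, mean=10.0, seed=1):
    """k-way stratification: one uniform R yields k draws (j + R)/k, one per subinterval."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(rounds):
        r = rng.random()
        for j in range(k):
            u = (j + r) / k                     # stratified uniform in [j/k, (j+1)/k)
            total += -mean * math.log(1.0 - u)  # inverse transform to an exponential
            count += 1
    return total / count

est = stratified_mean()  # close to E[T_1] = 10
```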
C. Control Variates
Suppose we are interested in estimating the distribution function, or a parameter thereof, of the
completion time of Network I below. It would be extremely arduous to compute the completion time
distribution of that network by analytic methods. The control variate procedure requires that we
construct a simplified network "similar" to Network I, such as Network II (see Figure 3).
NETWORK I
NETWORK II

Figure 3.
Under certain assumptions (see Section 4) concerning the link time distributions, Network II may
be analyzed by exact methods. The general idea of the control variates technique is to exploit the
similarity between I and II in order to improve estimates concerning I; in doing so, use is made of the
exact knowledge of II. In other words, simulation is used to correct a known result for II to bring it
close to I. Again the key to the correction procedure is to reuse the basic random numbers to create
parallel realizations. However, this time the attempt is made to achieve a high positive correlation
between the behaviors of I and II.
To clarify these ideas, suppose we are interested in estimating the expected completion time,
E[M_I], of Network I. Suppose further that the distribution functions corresponding to activity times
in Network I, except that of T_7, are amenable to mathematical computations, e.g., exponentials, uniforms,
gammas, etc. We might construct the control Network II as follows: let T_A and T_B each replace a
serial pair of Network I activities by the sum of one expected value and one actual activity time (e.g.,
T_A = E[T_1] + T_2); let T_C be a random variable highly (positively) correlated with T_7, but possessing
a simpler (easier to manipulate mathematically) distribution function than that of T_7; and let the remaining
activity times be as in Network I. We then run parallel simulations of the time to complete each network
using the same set of random draws for corresponding link times; i.e., when we draw a random number to
compute an activity time of Network I, we use the resulting value in computing the corresponding time in
Network II. Similarly, we use the same random draw in computing T_7 and T_C. This then implies that a
positive correlation tends to exist between M_I and M_II. Notice that since
(2.20) E[M_I] = E[M_II] + E[M_I − M_II],
it is possible to estimate E[M_I] by computing E[M_II] analytically, and then estimating E[M_I − M_II]
from a sampling experiment. The latter step involves obtaining n realizations M_I^(i) and M_II^(i)
(i = 1, 2, . . ., n) and averaging their differences, say

(2.21) D̄ = (1/n) Σ_{i=1}^n (M_I^(i) − M_II^(i)).

Since the link times shared the same random numbers, M_I^(i) and M_II^(i) are positively correlated. Thus

(2.22) Var[D̄] = (1/n){Var[M_I] + Var[M_II] − 2 cov[M_I, M_II]}.
Hence, it follows that if
(2.23) Var[M_II] < 2 cov[M_I, M_II],
a reduction in variance over a straight simulation has been achieved. Of course the degree of the
reduction depends upon the skill with which the approximating network, II, is selected and analyzed.
Usually there is some freedom for choice of II and the latter will have many fewer links than I. One
method for forming the control variate is to use a subnetwork of the original complex network. The
subnetwork can be formed by deleting those activities that have low probabilities of being on the
critical path.
Certain natural extensions of this method suggest themselves:
(a) As described, the control variates method applies straightforward Monte Carlo to estimate
the correction to E[M_II], i.e., the second term of (2.20). Sometimes this estimate can be further improved
by supplementary use of antithetic or stratification techniques; see Example 4 in Section 3.
(b) There is no reason to restrict attention to one approximating network. Suppose another, III,
is introduced. Then

(2.24) E[M_I] = w_1 E[M_II] + w_2 E[M_III] + w_1 E[M_I − M_II] + w_2 E[M_I − M_III],

where the weights w_1 and w_2 (w_1 + w_2 = 1) may either be chosen equal to one-half or, perhaps better,
estimated by making use of the regression technique; see Hammersley and Handscomb [4, p. 66]. In
the latter we seek w_1, w_2 to minimize

(2.25) Var[w_1 Δ_1 + w_2 Δ_2] = w_1² Var[Δ_1] + 2 w_1 w_2 cov(Δ_1, Δ_2) + w_2² Var[Δ_2],

where w_1 + w_2 = 1 and Δ_1 = M_I − M_II, Δ_2 = M_I − M_III. Although the latter variances and covariances
are unknown, they may be estimated from the sequences of individual realizations {M_I^(i) − M_II^(i)} and
{M_I^(i) − M_III^(i)}. It seems possible that bias introduced by such a procedure for finding weights may be
effectively reduced by application of the Tukey-Quenouille jackknife procedure. Finally, there is no
necessity for limiting the analysis to two approximating networks. It should be mentioned that one very
simple — and traditional — control variate derives from the original PERT procedure of finding that path
through the network the sum of whose expected link times is longest. Then all other paths are ignored
and the distribution of network completion time is taken to be normally distributed with mean equal
to the sum of the means of the activities on the above "critical path," and with variance equal to the
sum of the variances. MacCrimmon and Ryavec [8] and Van Slyke [11], as well as others, have examined
the bias of such an approximation; the control and antithetic variables procedures advanced here
should be capable of considerable improvements over this PERT approximation.
(c) The various techniques mentioned have emphasized the determination of the mean or expected
time to complete a PERT network; however, they apply equally well to the estimation of the expectations
of other functions. One such is

(2.26) φ(M) = 1 for M ≥ x
            = 0 for M < x;

the expectation of the latter is just the probability that M ≥ x, which may be of use. In the event that x
is rather large (or small), then application of the method of importance sampling may be of use in
improving precision.
The success of the three Monte Carlo techniques discussed above will vary from network to network.
Indeed, even on the same network, the success of the techniques will vary from experiment to
experiment due to the randomness of the draws generated for an experiment. In the section to follow,
several examples of these techniques are illustrated numerically so that the reader may get a feel for
the improvements that are "typical." In the process, control networks involving exponentially distributed
link times are employed. Methods for handling such networks analytically (by the Markov property
and Laplace transform) and numerically (by numerical transform inversion) are relegated to a later
section.
3. EXAMPLES
To make a beginning, the following very simple networks, Figure 4, were treated by means of the
antithetic and stratification procedures.
EXAMPLE 1:

NETWORK 1 (WHEATSTONE)
NETWORK 2 (SERIES)
NETWORK 3 (SERIES-PARALLEL)

Figure 4.
The distributions for link times (T_i) are identical and independent exponentials:

(3.1) P{T_i > x} = e^(−x/10) for x ≥ 0
    = 1 for x < 0.
All of these examples may be analyzed analytically, and the actual expected completion times and
variances, etc. have been calculated.
As a check of the methods suggested, antithetic and three-way stratified sampling were applied,
as in Section 2, A and B, to estimate the expected completion time. The results concerning variance
reduction are summarized in Tables 1 and 2.
Table 1. Antithetic Variates: Variance Reduction (T* = antithetic realization, (T + T')/2)

Network   No. of realiza-   Var(T)/N   Var(T')/N   Average of       X_1      Var(T*)/N   X_2
          tions, N                                 columns 3 & 4
(1)       (2)               (3)        (4)         (5)              (6)      (7)         (8)

1         50                5.341      5.986       5.664            5.774    1.849       3.12:1
1         100               2.830      2.650       2.740            2.887    0.852       3.39:1
1         200               1.526      1.443       1.485            1.443    0.428       3.37:1
1         400               0.760      0.691       0.726            0.722    0.227       3.18:1

2         50                10.110     10.724      10.417           10.000   1.814       5.51:1
2         100               5.133      4.781       4.957            5.000    0.905       5.53:1
2         200               2.375      2.531       2.453            2.500    0.444       5.63:1
2         400               1.233      1.205       1.219            1.250    0.225       5.56:1

3         50                10.486     9.537       10.011           10.000   3.644       2.74:1
3         100               4.685      5.553       5.119            5.000    1.855       2.70:1
3         200               2.601      2.388       2.495            2.500    0.829       3.02:1
3         400               1.319      1.243       1.281            1.250    0.422       2.96:1

Column X_1 is Var(T)/N, computed analytically.
Column X_2 is the ratio describing variance reduction: X_1 divided by Var(T*)/N.
Table 2. Stratification: Variance Reduction (T** = stratified realization, (T + T' + T'')/3)

Network   No. of realiza-   Var(T)/N   Var(T')/N   Var(T'')/N   Average of         X_1      Var(T**)/N   X_2
          tions, N                                              columns 3, 4 & 5
(1)       (2)               (3)        (4)         (5)          (6)                (7)      (8)          (9)

1         50                6.368      5.869       5.761        5.999              5.774    1.171        4.93:1
1         100               2.846      3.091       2.939        2.959              2.887    0.631        4.58:1
1         200               1.464      1.595       1.355        1.471              1.443    0.319        4.52:1
1         400               0.758      0.704       0.693        0.718              0.722    0.153        4.72:1

2         50                9.565      9.831       10.235       9.887              10.000   1.946        5.14:1
2         100               5.090      4.713       5.064        4.956              5.000    0.934        5.35:1
2         200               2.479      2.411       2.457        2.449              2.500    0.454        5.51:1
2         400               1.235      1.243       1.203        1.227              1.250    0.229        5.46:1

3         50                9.942      10.737      10.001       10.226             10.000   2.422        4.13:1
3         100               5.037      5.569       5.561        5.389              5.000    1.258        3.98:1
3         200               2.754      2.432       2.542        2.576              2.500    0.592        4.22:1
3         400               1.378      1.225       1.272        1.292              1.250    0.311        4.02:1

Column X_1 is Var(T)/N, computed analytically.
Column X_2 is the ratio describing variance reduction: X_1 divided by Var(T**)/N.
As can be seen, the variance reduction obtained by utilizing the antithetic or stratification procedure is
greater than that from doubling the sample size and sampling independently.
We remark, in passing, that the estimate of completion time variance obtained by simple averaging
of the variance estimates obtained from the antithetic experiments, i.e., the estimate in Column 5 of
Table 1, is likely to be biassed. Agreement with the calculated values seems, however, to be quite
satisfactory for the present example.
In the second example control variates are applied to Network 1 (Wheatstone) above.
EXAMPLE 2: Obvious possible control networks to use in connection with Network 1, as shown
in Figure 5, are:
CONTROL 1*
CONTROL 2*

Figure 5.

*The random variables, T_i, in the control networks are identical to those in the original Network 1.
The completion time for Network 1 may be written as
(3.2) T = max(T_1 + T_2, T_1 + T_3 + T_5, T_4 + T_5),
while that for Controls 1 and 2 are, respectively,
(3.3) C(1) = T_1 + T_3 + T_5
      C(2) = T_1 + max(T_2, T_3 + T_5).

Again the true values may be computed:

(3.4) E[T] = 34.58
      E[C(1)] = 30
      E[C(2)] = 32.5
Table 3 illustrates the effect of the technique. In the table, the control estimate is
(3.5) T̂_C(i) = T̄ + (E[C(i)] − C̄(i)), i = 1, 2.

Here the mean-squared error (m.s.e.) reduction is tabulated in addition to the variance reduction;
the m.s.e. is estimated by averaging the squared difference between the control estimate and E[T]. Analytically,

Var[T] = 289
Var[C(1)] = 300
Var[C(2)] = 294.
Table 3. Effects on Variances

Number of realizations, N     50        100       200       400

T̄                            37.74     33.84     34.01     34.84
Var(T̄)/N                     6.23      2.35      1.47      0.75

C̄(1)                         27.31     29.32     29.13     30.34
T̂_C(1)                       35.05     34.02     34.88     34.50
Var(T̂_C(1))/N                1.97      0.72      0.41      0.36
Var. Red.                     3.16:1    3.27:1    3.59:1    2.88:1
M.S.E. Red.                   3.86:1    4.03:1    2.98:1    3.04:1

C̄(2)                         34.84     31.35     32.04     32.70
T̂_C(2)                       35.40     34.99     34.47     34.64
Var(T̂_C(2))/N                0.82      0.46      0.25      0.15
Var. Red.                     7.59:1    5.11:1    5.88:1    5.00:1
M.S.E. Red.                   5.09:1    4.17:1    4.93:1    4.03:1
The reader will observe that the variance and squared error improvements obtained with the first
control variate are roughly the same as that achieved with the antithetic variate and stratification
procedures; however, the second control variate yielded improvements almost twice as large as those
of the other Monte Carlo techniques.
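The control-variate computation of this example can be sketched in a few lines; the network and C(1) are those of (3.2)-(3.4), with E[C(1)] = 30 exact, while the sample size and seed are our own choices.

```python
import math
import random

MEAN = 10.0  # E[T_i] = 10 for every link, as in (3.1)

def control_estimate(n=20000, seed=5):
    """Estimate E[T] for the Wheatstone network using C(1) = T_1 + T_3 + T_5 as control."""
    rng = random.Random(seed)
    diff_total = 0.0
    for _ in range(n):
        t1, t2, t3, t4, t5 = (-MEAN * math.log(1.0 - rng.random()) for _ in range(5))
        t = max(t1 + t2, t1 + t3 + t5, t4 + t5)  # completion time, as in (3.2)
        c1 = t1 + t3 + t5                        # control C(1), reusing the same draws
        diff_total += t - c1
    return 30.0 + diff_total / n                 # E[C(1)] + average correction, as in (3.5)

est = control_estimate()  # close to E[T] = 34.58
```

Because T and C(1) share the draws for T_1, T_3, and T_5, the difference T − C(1) has far smaller variance than T itself, which is the source of the reductions in Table 3.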
EXAMPLE 3: This example refers to the network depicted in Figure 6.
Figure 6.
It will be of interest to apply some of our techniques to this problem, since it exhibits greater complexity
than the examples previously treated.
Obviously, complication is introduced by such "crossing" links as link 10, having completion time
T_10. If all link times have the same mean (assumed here) it seems natural to use the partial network of
Figure 7 as a control. Of course, other possibilities (e.g., that involving links 1, 5, 10, 7, 8, and 9) are of
interest as well. It turns out that the control network of Figure 7 can be completely analyzed numerically
(see the next section) if link times are independently and exponentially distributed. Supposing that
E[T_i] = 10, then, we find that

(3.6) E[T_AB^(c)] = 47.5, Var[T_AB^(c)] = 418.8,

where T_AB^(c) denotes the time to complete the control network of Figure 7.
Figure 7.
Table 4 summarizes our sampling results when the number of repetitions is 25.
SAMPLING RESULTS

Table 4. Number of Repetitions: N = 25

                               Straight   Antithetic   Antithetic   Control    Control-
                                                       estimate     estimate   regression
Means
  Full Network (T_AB)          58.09      50.95        54.5         57.4       57.5
  Control Network (T_AB^(c))   48.2       42.7         45.5
Variances
  Full Network                 790        373          208          314        298
  Control Network              689        267          111
The antithetic and control procedures adopted here reduce the straightforward variance of the estimate by one-half to one-third.
The above experiment was repeated with N = 50 independent repetitions. The relative variance
reductions were found to be quite comparable to those for N = 25 (see Table 5).
SAMPLING RESULTS

Table 5. Number of Repetitions: N = 50

                               Straight   Antithetic   Antithetic   Control    Control-
                                                       estimate     estimate   regression
Means
  Full Network                 54.6       54.2         54.4         54.2       54.3
  Control Network              47.9       46.8         47.4
Variances
  Full Network                 328        395          105          110        91
  Control Network              390        382          116
EXAMPLE 4: Again the network considered is that of Figure 6. The experiments carried out are
related to, but somewhat more extensive than, those of the previous example. We have completed independent
samples of N = 25 and N = 50 (10 of each). From each sample (25 and 50) we computed the
various estimates obtained in the last experiment, and estimates of the variances of those estimates.
Two additions were made to the previous experiment. First, an alternative control network, depicted in Figure 8, is utilized for comparison with that of Figure 7.
Figure 8.
The control networks of Figures 7 and 8 will be called the Upper Control Network and the Lower
Control Network, respectively. Second, a combined antithetic-control estimate is added to the list of
those considered. In brief, the antithetic estimators for T_AB and T_AB^(c) are constructed; these are

(3.7) T_AB,A = (T_AB + T_AB')/2, and T_AB,A^(c) = (T_AB^(c) + T_AB^(c)')/2.

The same random numbers that appear in T_AB^(c) appear also in T_AB, as was true earlier; their antithetic
versions appear in T_AB' and T_AB^(c)'. An unbiassed control estimate of E[T_AB] may be created using the
variables (3.7):

(3.8) Ê[T_AB]_A,C = E[T_AB^(c)] + [T̄_AB,A − T̄_AB,A^(c)],

where the bars denote averages over the N realizations. (Ê[T_AB]_A,C denotes the combined antithetic-control
estimate of E[T_AB].) Finally, a regression-adjusted estimate of the form

(3.9) Ê[T_AB]_A,C,R = T̄_AB,A + β(T̄_AB,A^(c) − E[T_AB^(c)])

is considered, where the constant β is so selected as to approximately minimize the mean-squared error
of the estimate; see (2.24). (Ê[T_AB]_A,C,R denotes the regression-adjusted antithetic-control estimate
of E[T_AB].)
The results of the simulation experiments are presented in Tables 6 and 7.
Examination of the tables reveals the effects of the several variance-reducing maneuvers employed.
Roughly speaking, a halving of the variance of the estimate results when either antithetic or control
methods alone are used. Present evidence indicates that the Upper Control Network is the more
effective as a control variate. If both the antithetic and the control devices are used simultaneously,
as in (3.8), the variance is halved again.

A word about the regression-adjusted estimates is in order. It may be shown that the parameter
β in (3.9) is best taken to be

β* = −cov[T̄_AB^(c), T̄_AB] / Var[T̄_AB^(c)],
SIMULATION RESULTS

Table 6. Means: N = 25

Lower Control Network

Runs               Straight   Control   Control-     Antithetic   Antithetic-   Regression-adjusted
                                        regression                control       antithetic-control
1                  50.1       50.5      51.2         54.5         51.4          51.4
2                  56.9       54.6      55.6         54.5         56.4          57.1
3                  53.9       56.0      55.4         55.0         55.6          55.5
4                  49.6       52.7      52.5         52.3         53.8          54.0
5                  54.3       54.3      54.3         51.2         54.1          53.6
6                  55.1       52.3      52.3         53.4         54.3          54.0
7                  54.2       53.5      53.5         54.6         54.7          54.7
8                  54.1       51.8      52.1         56.8         54.1          54.5
9                  57.6       52.9      54.8         51.8         53.0          52.9
10                 56.8       56.1      56.4         57.3         56.6          56.7
Average            55.1       53.5      53.8         54.1         54.4          54.4
Var. of estimate   6.2        3.2       3.1          4.0          2.4           2.7
S.D. of estimate   2.5        1.8       1.8          2.0          1.56          1.63

Upper Control Network

Runs               Straight   Control   Control-     Antithetic   Antithetic-   Regression-adjusted
                                        regression                control       antithetic-control
1                  50.1       57.4      57.5         54.5         56.5          56.5
2                  56.9       55.1      55.9         54.5         53.7          53.6
3                  53.9       51.7      52.0         55.0         54.2          54.2
4                  49.6       53.9      52.7         52.3         53.4          53.4
5                  54.3       52.1      52.4         51.2         52.5          52.3
6                  55.1       54.7      54.7         53.4         53.3          53.3
7                  54.2       55.3      54.9         54.6         55.0          55.0
8                  54.1       54.8      54.7         56.8         55.9          55.9
9                  57.6       50.5      52.0         51.8         53.3          53.0
10                 56.8       55.8      56.2         57.3         56.3          56.4
Average            55.1       54.1      54.3         54.1         54.3          54.4
Var. of estimate   6.2        4.4       3.8          4.0          2.0           2.3
S.D. of estimate   2.5        2.1       1.9          2.0          1.4           1.5
SIMULATION RESULTS

Table 7. Means: N = 50

Lower Control Network

Runs               Straight   Control   Control-     Antithetic   Antithetic-   Regression-adjusted
                                        regression                control       antithetic-control
1                  54.6       56.0      55.6         54.4         54.1          54.2
2                  55.5       56.5      56.2         57.5         54.9          55.6
3                  54.7       55.2      55.0         55.5         54.7          54.8
4                  56.1       58.6      57.8         55.2         56.8          56.8
5                  51.1       51.2      51.2         51.9         54.2          53.6
6                  50.5       56.7      54.6         53.3         54.9          54.3
7                  58.7       55.4      56.2         55.1         54.6          54.8
8                  52.8       52.3      52.3         56.2         53.4          53.9
9                  58.9       54.2      55.6         57.3         55.6          55.9
10                 54.2       52.5      52.9         52.7         52.9          52.9
Averages           54.7       54.9      54.7         54.9         54.6          54.7
Var. of estimate   7.9        5.3       4.1          3.5          1.2           1.22
S.D. of estimate   2.8        2.3       2.0          1.9          1.09          1.10

Upper Control Network

Runs               Straight   Control   Control-     Antithetic   Antithetic-   Regression-adjusted
                                        regression                control       antithetic-control
1                  54.6       54.2      54.3         54.4         54.6          54.5
2                  55.5       54.8      55.0         57.5         56.1          56.5
3                  54.7       52.7      52.9         55.5         54.6          54.8
4                  56.1       54.4      54.7         55.2         53.8          53.8
5                  51.1       55.9      54.6         51.9         54.4          53.8
6                  50.5       52.3      51.9         53.3         54.4          54.1
7                  58.7       53.1      54.0         55.1         53.6          53.7
8                  52.8       56.1      55.4         56.2         57.2          56.8
9                  58.9       56.7      57.3         57.3         56.3          56.4
10                 54.2       54.1      54.2         52.7         54.8          54.4
Averages           54.7       54.4      54.4         54.9         55.0          54.9
Var. of estimate   7.9        2.2       2.1          3.5          1.4           1.5
S.D. of estimate   2.8        1.5       1.4          1.9          1.2           1.2
and it is obvious that the covariance term must be estimated from data; see [4, p. 67]. In the experiments,
a β-estimate was formed for each sample of 25 or 50 and used to create the Control-Regression
and Regression-Adjusted Antithetic-Control figures.
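The β-estimation step can be sketched as follows. For simplicity this sketch applies the regression adjustment to the plain (non-antithetic) control estimate on the Wheatstone network of Example 2, rather than to the authors' antithetic-control estimate; the estimate β = −cov/var is the sample analogue of β* above, and the sample size and seed are our own assumptions.

```python
import math
import random

MEAN = 10.0  # E[T_i] = 10 for every link

def regression_control(n=20000, seed=9):
    """Regression-adjusted control estimate: T_bar + beta*(C_bar - E[C]), beta = -cov/var."""
    rng = random.Random(seed)
    ts, cs = [], []
    for _ in range(n):
        t1, t2, t3, t4, t5 = (-MEAN * math.log(1.0 - rng.random()) for _ in range(5))
        ts.append(max(t1 + t2, t1 + t3 + t5, t4 + t5))  # network completion time (3.2)
        cs.append(t1 + t3 + t5)                          # control C(1) from the same draws
    tbar, cbar = sum(ts) / n, sum(cs) / n
    cov = sum((c - cbar) * (t - tbar) for c, t in zip(cs, ts)) / n
    var_c = sum((c - cbar) ** 2 for c in cs) / n
    beta = -cov / var_c                                  # sample analogue of beta*
    return tbar + beta * (cbar - 30.0)                   # E[C(1)] = 30 exactly

est = regression_control()  # close to E[T] = 34.58
```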
4. ANALYTICAL METHODS BASED ON EXPONENTIAL DISTRIBUTIONS
The assumption that exponential distributions govern link times in networks often results in
considerable mathematical simplicity. Of course the link times encountered in real problems are apt
not to be of exponential character. Their densities are usually expected to exhibit a mode (peak) at
some positive time value, and perhaps to have a vaguely normal or Gaussian appearance.
Two remarks may be made:
i) If empirical observation indeed suggests a non-exponential form of the type described above,
combinations (e.g., convolutions) of exponentials may be used as approximations.
ii) Certain types of operations may be expected, theoretically, to give rise to near-exponential
distributions. These are of a quasi-Sisyphean character: link traversal terminates when a certain task
is successfully completed, but if success does not occur the task must be repeated; success occurs on
any attempt with small independent probability, p. Certain computer program debugging, or other
trouble shooting, processes may be of this nature. Construction completion times might have a similar
character if weather conditions intervene and necessitate repetitions. In another context, consider the
time required to pass through a part of a transportation route on which one may encounter randomly
placed congestion points, each of which offers an independently and similarly distributed delay. If the
number of congestion points has a geometric distribution with a reasonably large mean, the time required
has exponential properties. Thus the exponential may be worth consideration, and we shall use it as a basis
for discussion here.
These simple properties of independently and exponentially distributed random variables are
well known or easily derived: (a) that sums of identically distributed exponential variables are gamma
distributed, (b) that sums of differently distributed exponential variables are a linear (nonconvex)
combination of exponentials, and (c) that the maximum of n exponential random variables is distributed
as a product of the respective distributions. Properties (a) and (b) bear on the modelling of simple series
networks, while (c) relates to links in parallel. Sometimes it is possible to build up complex network
distributions from simpler ones as follows.
Figure 9.
The dotted path, in Figure 9, represents a complex network with T_U as completion time, while T_L is
a single exponential path in parallel. Then

(4.1) P{T_AB ≤ x} = (1 − e^(−μx)) U(x),

where U is the d.f. of T_U. If we now introduce the Laplace transform E[e^(−sT_AB)] we find that
(4.2) E[e^(−sT_AB)] = ∫_0^∞ e^(−sx)(1 − e^(−μx)) dU(x) + ∫_0^∞ e^(−sx) U(x) μe^(−μx) dx
    = Ũ(s) − Ũ(s + μ) + [μ/(s + μ)] Ũ(s + μ),

Ũ(s) being the Laplace-Stieltjes transform of U. This procedure can obviously be repeated to generate
further exponential parallel links, with the last transform generated serving as Ũ(s) at the next stage.
Consider
SPECIAL CASE A:

(4.3) U(x) = 1 − e^(−λx), x ≥ 0
    = 0, x < 0.

Then immediately

(4.4) E[e^(−sT_AB)] = λ/(λ + s) − λ/(λ + μ + s) + [μ/(μ + s)][λ/(λ + μ + s)]
    = [λ/(λ + s)][μ/(μ + s)][(λ + μ + 2s)/(λ + μ + s)].

If λ = μ the above becomes

E[e^(−sT_AB)] = [λ/(λ + s)][2λ/(2λ + s)].

Thus T_AB is the sum of an exponential with mean (2λ)^(−1) and another with mean λ^(−1); this follows directly
from the Markovian (memoryless) property of the exponential density.
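The λ = μ conclusion is easy to spot-check by Monte Carlo (the rate value below is our own choice): the maximum of two independent rate-λ exponentials should have mean (2λ)^(−1) + λ^(−1).

```python
import random

lam = 0.1                      # assumed common rate, so each link has mean 10
rng = random.Random(2)
n = 200000
total = 0.0
for _ in range(n):
    total += max(rng.expovariate(lam), rng.expovariate(lam))
est = total / n                # should be near 1/(2*lam) + 1/lam = 15
```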
A useful generalization of the above situation occurs when T_L has the distribution

(4.5) P{T_L ≤ x} = 1 − Σ_j c_j e^(−μ_j x).

Then basic transform properties provide that

(4.6) E[e^(−sT_AB)] = s ∫_0^∞ e^(−sx) [1 − Σ_j c_j e^(−μ_j x)] U(x) dx;

here the Laplace transform of the distribution U is

(4.7) ∫_0^∞ e^(−sx) U(x) dx = (1/s) ∫_0^∞ e^(−sx) dU(x) = Ũ(s)/s.

Consequently,

(4.8) E[e^(−sT_AB)] = Ũ(s) − s Σ_j [c_j/(s + μ_j)] Ũ(s + μ_j).
SPECIAL CASE B: Suppose T_L is distributed as a convex combination (probability mixture)
of different exponentials; i.e., with probability c_j the lower link time is exponential with parameter
μ_j. Transform (4.8) then represents this situation. Repetition allows many parallel paths to be added.

SPECIAL CASE C: Now take T_L to be distributed as a sum of two dissimilar exponentials:

(4.9) E[e^(−sT_L)] = [μ_1/(μ_1 + s)][μ_2/(μ_2 + s)]
    = [μ_2/(μ_2 − μ_1)][μ_1/(μ_1 + s)] − [μ_1/(μ_2 − μ_1)][μ_2/(μ_2 + s)].

Thus T_L has the distribution (4.5) with c_1 = μ_2(μ_2 − μ_1)^(−1) and c_2 = −μ_1(μ_2 − μ_1)^(−1). More dissimilar
components may be handled by applying a partial fractions decomposition to the transform of T_L.
SPECIAL CASE D: Consider the problem of Special Case C, but put μ₂ = μ₁ + δ, later allowing δ → 0.
T_L thus has a gamma distribution. We find that

(4.10)  E[e^{−sT_AB}] = U(s) − [s/(s+μ₁)] U(s+μ₁) + (sμ₁/δ) [U(s+μ₁+δ)/(s+μ₁+δ) − U(s+μ₁)/(s+μ₁)],

and when δ → 0,

(4.11)  E[e^{−sT_AB}] = U(s) − [s/(s+μ₁)] U(s+μ₁) + sμ₁ (d/dξ)[U(ξ)/ξ]_{ξ=s+μ₁}.

The formula (4.11) can be derived directly. Suppose T_L has the gamma distribution

(4.12)  P{T_L ≤ x} = 1 − e^{−ax} [1 + ax/1! + ⋯ + (ax)^{k−1}/(k−1)!].

Then by standard transform properties

(4.13)  E[e^{−sT_AB}] = s ∫₀^∞ e^{−sx} [1 − e^{−ax} (1 + ax/1! + ⋯ + (ax)^{k−1}/(k−1)!)] U(x) dx,

or

(4.14)  E[e^{−sT_AB}] = U(s) − s Σ_{j=0}^{k−1} [(−a)^j / j!] (d^j/dξ^j)[U(ξ)/ξ]_{ξ=s+a};

the zero-order derivative is taken to be the function itself. Clearly (4.11) and (4.14) agree when k = 2
and a = μ₁.
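The agreement can also be checked numerically. The sketch below assumes an exponential upper-path distribution, U(s) = k₀/(k₀+s) (k₀, a, s are illustrative values), and compares (4.14) with k = 2 against a direct term-by-term expansion of the integral (4.13).

```python
# Check that (4.14) with k = 2, a = mu1 matches a direct expansion of (4.13)
# for an assumed exponential upper path with transform U(s) = k0/(k0 + s).
k0, a, s = 1.3, 0.9, 0.7

U = lambda z: k0 / (k0 + z)                                # transform of U
Ut = lambda z: k0 / (z * (k0 + z))                         # U(z)/z
Utp = lambda z: -k0 * (k0 + 2 * z) / (z * (k0 + z)) ** 2   # d/dz [U(z)/z]

# (4.14) with k = 2: U(s) - s [U~(s+a) - a U~'(s+a)]
via_414 = U(s) - s * (Ut(s + a) - a * Utp(s + a))

# direct: s * integral of e^{-sx} (1 - e^{-k0 x})(1 - e^{-ax}(1 + ax)) dx,
# expanded term by term in closed form
direct = (1 - s / (s + k0) - s / (s + a) - s * a / (s + a) ** 2
          + s / (s + k0 + a) + s * a / (s + k0 + a) ** 2)

assert abs(via_414 - direct) < 1e-9
```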
SPECIAL CASE E: Refer to Example 3 of the previous section; the subnetwork of Figure 7 may
be studied directly by means of (4.14). First, put

(4.15)  U(s) = E[e^{−s(T₂+T₃)}] = [μ/(μ+s)]².
458
J. M. BURT, Jr., D. P. GAVER, AND M. PERLAS
Then according to (4.14) the transform associated with the completion time of the parallel network
involving link times T₂, T₃, T₅, and T₆ is

(4.16)  E[e^{−sT}] = [μ/(μ+s)]² − [s/(s+μ)]·[μ/(2μ+s)]² + sμ (d/dξ)[μ²/(ξ(μ+ξ)²)]_{ξ=μ+s}

               = [μ/(μ+s)]² − sμ²/[(s+μ)(2μ+s)²] − sμ³/[(μ+s)²(2μ+s)²] − 2sμ³/[(μ+s)(2μ+s)³].
It is now only necessary to multiply this expression by the transform of the sum of links involving T₁
and T₄, namely [μ/(μ+s)]², in order to obtain E[e^{−sT_AB}] for the subnetwork of Figure 7. The result may
be simplified to

(4.17)  E[e^{−sT_AB}] = 2[μ/(μ+s)]⁴ − (1/2)[μ/(μ+s)]²[2μ/(2μ+s)]² − (1/2)[μ/(μ+s)]²[2μ/(2μ+s)]³.
Differentiation then produces the mean, variance, and other desired moments. Although it is apparently
possible to invert such transforms analytically, the results will typically appear as incomprehensible
combinations of exponentials. We favor the numerical inversion of such transforms, using perhaps the
method of Abate and Dubner [1]. For given link time distributions one can thus quickly develop a
table of the distribution function of T_AB for the approximating network.
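As a quick sanity check on the simplified product form (4.17), the sketch below (assuming all six link times are independent exponentials with rate μ = 1) compares a seeded Monte Carlo estimate of E[e^{−sT_AB}] with the closed form.

```python
import math
import random

# Monte Carlo spot check of (4.17): all six link times independent exp(mu),
# T_AB = T1 + T4 + max(T2 + T3, T5 + T6)
random.seed(7)
mu, s, n = 1.0, 1.0, 200_000
E = lambda: random.expovariate(mu)

mc = sum(math.exp(-s * (E() + E() + max(E() + E(), E() + E())))
         for _ in range(n)) / n

g = mu / (mu + s)
h = 2 * mu / (2 * mu + s)
closed = 2 * g**4 - 0.5 * g**2 * h**2 - 0.5 * g**2 * h**3   # (4.17)
assert abs(mc - closed) < 0.005
```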
REFERENCES
[1] Abate, J. and H. Dubner, "Numerical Inversion of Laplace Transforms by Relating Them to the
Finite Fourier Transform," J. of the ACM, Vol. 15, No. 1 (Jan. 1968).
[2] Charnes, A., W. W. Cooper, and G. L. Thompson, "Critical Path Analyses Via Chance Con-
strained and Stochastic Programming," Operations Research 12, 460-470 (1964).
[3] Feller, W., An Introduction to Probability Theory and Its Applications (John Wiley and Sons,
New York, 1966), Vol. II.
[4] Hammersley, J. and D. C. Handscomb, Monte Carlo Methods (Methuen and Co., Ltd., London,
1967).
[5] Hartley, H. O. and A. W. Wortham, "A Statistical Theory for PERT Critical Path Analysis,"
Management Science, 12, 469-481 (June 1966).
[6] Jewell, W., "Risk Taking in Critical Path Analysis," Management Science, 11, 438-443 (Jan.
1965).
[7] Kelley, J. E., "Critical Path Planning and Scheduling; Mathematical Basis," Operations Re-
search, 9, 296-320 (1961).
[8] MacCrimmon, K. R. and C. A. Ryavec, "An Analytical Study of the PERT Assumption," Opera-
tions Research, 12, 16-37 (1964).
[9] Martin, J. J., "Distribution of the Time Through a Directed Acyclic Network," Operations
Research, 13, 46-66 (1965).
[10] Moder, J. and C. Phillips, Project Management with CPM and PERT (Reinhold Pub. Co., New
York, 1964).
[11] Van Slyke, R. M., "Monte Carlo Methods and the PERT Problem," Operations Research, 11,
839-860 (1963).
[12] Wiest, J. D., "Some Properties of Schedules for Large Projects with Limited Resources," Opera-
tions Research, 12, 395-418 (1964).
A NETWORK ISOLATION ALGORITHM*
M. Bellmore
The Johns Hopkins University
G. Bennington
North Carolina State University†
and
S. Lubore
The MITRE Corporation
ABSTRACT
A set of edges D, called an isolation set, is said to isolate a set of nodes R from an undirected network if every chain between the nodes in R contains at least one edge from the
set D. Associated with each edge of the network is a positive cost. The isolation problem is
concerned with finding an isolation set such that the sum of its edge costs is a minimum.
This paper formulates the problem of determining the minimal cost isolation as a 0-1
integer linear programming problem. An algorithm is presented which applies a branch and
bound enumerative scheme to a decomposed linear program whose dual subproblems are
minimal cost network flow problems. Computational results are given.
The problem is also formulated as a special quadratic assignment problem and an algo-
rithm is presented that finds a local optimal solution. This local solution is used for an initial
bound.
INTRODUCTION
Given an undirected connected network G = [N; E] with r sets of distinguished nodes, R_j ⊂ N,
for j = 1, 2, ..., r; a set of edges D ⊂ E, called an isolation set (or isolation), has the property that the
removal of the edges in D partitions the graph into p ≥ r connected components G₁, G₂, ..., G_p.
In addition, for the set D to be an isolation with respect to the r distinguished node sets, the p ≥ r
connected components must be capable of being grouped into exactly r sets, P_j for j = 1, ..., r, such
that each set P_j contains all of the distinguished nodes in R_j. Associated with each edge (x, y) ∈ E of
the graph is a positive cost of removal, c(x, y) > 0. A minimal cost isolation, D₀, is an isolation set
the sum of whose edge costs is a minimum.
One example of an isolation problem is that faced by an antagonist who desires to attack a com-
munications network in a manner which prevents specific sets of nodes Rj from communicating with
other sets of nodes R_i, i ≠ j. If a cost is prescribed for the attack of each link in the network (it is assumed that the cost of attacking nodes is prohibitively high), then the attacker wishes to find a minimal
*This paper was presented at the 35th National ORSA meeting in Denver, Colorado, on June 17, 1969.
†This work was performed while the author was at The MITRE Corporation.
cost isolation. Problems of a similar type have been investigated and solved by Greenberg [5] and
Jarvis [6] in a different fashion than the algorithm presented in this paper.
A variant of the backboard wiring problem [9] can also be formulated as an isolation problem:
consider an existing system composed of subsystems. New components are to be added to the existing
system. Each component to be added requires additional wires to be connected to each subsystem
and to each other component. Let c(x, y) be the number of wires to be added between x and y. Assum-
ing sufficient space in each subsystem to accommodate any or all of the additional components, then
one desires to place the new components in the existing subsystems in a manner that minimizes the
number of additional wires between the subsystems. The set R_j corresponds to the jth subsystem.
Bennington [1] has related the problem of determining the minimal multicommodity disconnecting
set in an undirected graph to the isolation problem.
Example:
Figure 1 shows an undirected graph associated with a simple example of the isolation problem.
[Figure 1: a five-node graph; the visible edge costs are c(1,4) = 1, c(3,4) = 1, and c(2,5) = 2.]

Figure 1. Example of the isolation problem
It is desired to isolate nodes 1, 2, and 3 from each other. The number adjacent to the edges in the graph
is the cost to remove the edge. The minimal cost isolation has a value of 3 corresponding to removal
of the edges
{(1,4), (2,5)} or {(1,4), (3,4), (3,5)}.
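This small example can be confirmed by brute force. The sketch below assumes the Figure 1 edge set is {(1,4), (3,4), (3,5), (4,5), (2,5)} with c(3,5) = 1 and a hypothetical c(4,5) = 3 (the latter cost is not visible above); it enumerates all edge subsets and recovers both minimal isolations of value 3.

```python
from itertools import combinations

# Hypothetical reading of Figure 1; c(4,5) = 3 is an assumed value.
edges = {(1, 4): 1, (3, 4): 1, (3, 5): 1, (4, 5): 3, (2, 5): 2}
nodes = {1, 2, 3, 4, 5}
terminals = [1, 2, 3]

def component_of(removed):
    """Label each node with a component representative after edge removal."""
    label = {}
    for start in nodes:
        if start in label:
            continue
        stack, label[start] = [start], start
        while stack:
            u = stack.pop()
            for (a, b) in edges:
                if (a, b) in removed:
                    continue
                for x, y in ((a, b), (b, a)):
                    if x == u and y not in label:
                        label[y] = start
                        stack.append(y)
    return label

best, optima = None, set()
for size in range(len(edges) + 1):
    for sub in combinations(edges, size):
        removed = set(sub)
        lab = component_of(removed)
        if len({lab[t] for t in terminals}) == 3:   # nodes 1, 2, 3 separated
            cost = sum(edges[e] for e in removed)
            if best is None or cost < best:
                best, optima = cost, {frozenset(removed)}
            elif cost == best:
                optima.add(frozenset(removed))
```

Under these assumed costs, `best` is 3 and `optima` contains exactly the two edge sets named in the text.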
THE ISOLATION PROBLEM AS A 0-1 INTEGER LINEAR PROGRAM
Suppose that an isolation has been found for the network G = [N; E]; then each node x ∈ N is contained in exactly one of the partitions P_j, j = 1, ..., r. In addition, each R_j ⊂ P_j.
Let

π_j(x) = 1, x ∈ P_j,
       = 0, x ∉ P_j.
Each undirected edge of the network is assigned an arbitrary orientation, that is, each edge is
defined by one of the ordered pairs (x, y) or (y, x) corresponding to the endpoints (nodes) of the
edge. The isolation problem is stated as:
Minimize

Z = Σ_{(x,y)∈E} c(x,y) Σ_{j=1}^{r} [γ_j(x,y) + δ_j(x,y)],

subject to

(1)  Σ_{j=1}^{r} π_j(x) = 1,  x ∈ N − ∪ R_j,
(2)  π_j(x) = 1, x ∈ R_j;  = 0, x ∈ R_i, i ≠ j;  j = 1, ..., r,

(3)  π_j(x) − π_j(y) + γ_j(x,y) − δ_j(x,y) = 0,  (x,y) ∈ E,  j = 1, ..., r,

(4)  π_j(x), γ_j(x,y), δ_j(x,y) are 0–1 variables.
Equations (2) fix each of the distinguished nodes x ∈ R_j into a single subset P_j.

If nodes x and y are in the same set, say P_s, corresponding to a feasible isolation, then π_s(x)
= π_s(y) = 1 and, from Eqs. (1) and (2), π_j(x) = π_j(y) = 0, j ≠ s. Further, γ_j(x,y) = δ_j(x,y) = 0, j = 1, ...,
r, is feasible in Eqs. (3) and, since c(x,y) > 0, minimizes Z for this isolation.

If nodes x and y are in different sets, say P_s and P_t, respectively, corresponding to a feasible isolation,
then π_s(x) = π_t(y) = 1; and from Eqs. (1) and (2), π_j(x) = 0, j ≠ s, and π_j(y) = 0, j ≠ t. Further, in
order to satisfy Eqs. (3), γ_s(x,y) = 0, δ_s(x,y) = 1; and γ_t(x,y) = 1, δ_t(x,y) = 0. Since c(x,y) > 0,
γ_j(x,y) = δ_j(x,y) = 0, j ≠ s, t.

From the above arguments, an edge (x,y) is contained in an isolation if either γ_j(x,y) = 1 or
δ_j(x,y) = 1 for some index j. Further, since for each δ_s(x,y) = 1 there exists a γ_t(x,y) = 1 and both
contribute to the value of the objective function, (1/2)Z is the value of a minimal cost isolation.
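The halving argument can be illustrated directly: given a feasible partition, set each γ_j and δ_j at its minimum-cost value satisfying Eqs. (3) and compare Z with the cut cost. The edge data and partition below are illustrative, not taken from the paper.

```python
# pi_j(x) = 1 iff part[x] == j; gamma/delta chosen minimally to satisfy Eq. (3)
edges = {(1, 4): 1, (3, 4): 1, (3, 5): 1, (4, 5): 3, (2, 5): 2}
part = {1: 1, 2: 2, 3: 3, 4: 1, 5: 2}   # node -> set index
r = 3

Z, cut = 0, 0
for (x, y), c in edges.items():
    for j in range(1, r + 1):
        pix = 1 if part[x] == j else 0
        piy = 1 if part[y] == j else 0
        gamma = max(0, piy - pix)   # gamma_j - delta_j = pi_j(y) - pi_j(x)
        delta = max(0, pix - piy)
        Z += c * (gamma + delta)
    if part[x] != part[y]:
        cut += c

assert Z == 2 * cut   # (1/2) Z is the isolation cost
```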
The proposed algorithm solves the isolation problem using the Land and Doig [7] branch and
bound enumerative scheme with restrictions (4) replaced by nonnegativity constraints (i.e., as a series
of linear programs). Each of the linear programs is decomposed using the Dantzig-Wolfe technique
[3]. Equations (1), which link the r subproblems together, correspond to the master problem.
The jth subproblem is:

Maximize  W = Σ_{x∈N} σ(x)π_j(x) − Σ_{(x,y)∈E} c(x,y)[γ_j(x,y) + δ_j(x,y)],

subject to:

(2)  π_j(x) = 1, x ∈ R_j;  = 0, x ∈ R_i, i ≠ j,

(2A) π_j(x) ≤ 1,  x ∈ N − ∪ R_i,

(3)  π_j(x) − π_j(y) + γ_j(x,y) − δ_j(x,y) = 0,  (x,y) ∈ E,

(4A) π_j(x), γ_j(x,y), δ_j(x,y) ≥ 0,

where σ(x) is the simplex multiplier corresponding to each constraint (1) in the master problem for
the free nodes (i.e., x ∉ R_i, i = 1, 2, ..., r), and σ(x) = 0 for the fixed nodes (i.e., x ∈ ∪ R_i).
Constraints (2A) have been included in order to insure that the subproblems are bounded. Unless
the subproblem is bounded, no feasible dual solution can be found. Since the solution technique to
be employed requires the dual solution to these subproblems, this restriction is important.
An alternative formulation of the jth subproblem that is easier to deal with is given by (the subscript
j has been dropped from all of the variables to alleviate the notational burden):

Minimize  W = − Σ_{x∈N} σ(x)π(x) + Σ_{(x,y)∈E} c(x,y)[γ(x,y) + δ(x,y)],

subject to

f(x, y):  π(x) − π(y) + γ(x,y) − δ(x,y) = 0,  (x,y) ∈ E,

p(x):  −π(x) = −1, x ∈ R_j;  = 0, x ∈ R_i, i ≠ j;  ≥ −1, x ∈ N − ∪ R_i,

π(x), γ(x,y), δ(x,y) ≥ 0.
The simplex multipliers associated with the dual of this problem are given to the left of their cor-
responding equations.
For notational convenience, let f(x, N) = Σ_{y∈N} f(x, y) and let f(N, x) = Σ_{y∈N} f(y, x).
The dual of this problem is

Maximize  T = Σ_{x∈N} b(x)p(x),

where

b(x) = 0, x ∈ R_i, i ≠ j;  b(x) = −1, otherwise,

subject to

π(x):  f(x, N) − f(N, x) − p(x) ≤ −σ(x),  x ∈ N,

γ(x, y):  f(x, y) ≤ c(x, y),
δ(x, y):  −f(x, y) ≤ c(x, y),
          f(x, y) unrestricted,  (x, y) ∈ E,

p(x) ≥ 0, if x ∈ N − ∪_{i=1}^{r} R_i;
p(x) unrestricted, if x ∈ ∪ R_i.
As before, the dual variables are given to the left of the constraint equations. This problem resembles
a single commodity flow problem with the exception of the f(x, N) − f(N, x) − p(x) ≤ −σ(x) equations
and the p(x) variables. This problem can be transformed into a more recognizable form. Upon adding
the nonnegative slack variables q(x) to the f(x, N) − f(N, x) − p(x) ≤ −σ(x) equations, the problem
becomes
(5)  Maximize  T = Σ_{x∈N} b(x)p(x),

subject to

(6)  f(x, N) − f(N, x) − p(x) + q(x) + σ(x) = 0,  x ∈ N,

(7)  −c(x, y) ≤ f(x, y) ≤ c(x, y),  (x, y) ∈ E,

(8)  p(x) ≥ 0, if x ∈ N − ∪_{i=1}^{r} R_i;  p(x) unrestricted, if x ∈ ∪ R_i,

(9)  q(x) ≥ 0,  x ∈ N.
Consider the redundant constraint formed by summing the negatives of Eqs. (6):

−Σ_{x∈N} f(x, N) + Σ_{x∈N} f(N, x) + Σ_{x∈N} p(x) − Σ_{x∈N} q(x) − Σ_{x∈N} σ(x) = 0.

Noting that Σ_{x∈N} f(x, N) = Σ_{x∈N} f(N, x), the redundant constraint can be simplified to

(10)  Σ_{x∈N} [p(x) − q(x) − σ(x)] = 0.
The constraint matrix formed by including this additional constraint has exactly one +1 and one
−1 in each column. Therefore, this problem has a representation as a network flow problem. The
directed network corresponding to the dual linear program of the jth subproblem will be denoted by
G′ = [N′; A′]. The set of nodes N′ is formed from the original set N by adding a dummy node s′. The
set of directed arcs A′ is formed from the original set of edges E by replacing each edge with two directed
and oppositely oriented arcs and by adding additional arcs to correspond to the variables q(x),
p(x), and constant σ(x). Since the nonnegative variable q(x) has a coefficient of +1 in Eq. (6) and a
coefficient of −1 in Eq. (10), an arc (x, s′) is added to A′ with an arc flow variable q(x, s′) to replace
q(x). In a similar manner, σ(x) is replaced by a constant arc flow variable σ(x, s′) or σ(s′, x), depending
on the sign of the simplex multiplier. The variable p(x) is replaced by an arc flow variable
p(s′, x) if p(x) ≥ 0 and by the arc flow variables p(s′, x) and p(x, s′) if p(x) is unrestricted in sign.
The characteristics of the arcs in A′ are summarized in Table I.
TABLE I. Maximum Cost Flow Problem (Subproblem j)

Arc        Lower bound   Upper bound   Cost   Comments
f(x, y)    −c(x, y)      c(x, y)       0      Replace each edge with two directed arcs
σ(x, s′)   σ(x)          σ(x)          0      If σ(x) ≥ 0
σ(s′, x)   −σ(x)         −σ(x)         0      If σ(x) < 0
q(x, s′)   0             ∞             0
p(s′, x)   0             ∞             0      x ∈ R_i, i ≠ j
p(s′, x)   0             ∞             −1     otherwise
p(x, s′)   0             ∞             1      x ∈ R_j
Figure 2 contains a portion of a graph that will be used to illustrate the form of the subproblems.
[Figure 2: a three-node graph with edge costs c(1,2) = 1, c(1,3) = 1, c(2,3) = 3.]

Figure 2. Subproblem network
Assume that in a subproblem π(1) ≤ 1, π(2) = 0, and π(3) = 1; and the simplex multipliers are
σ₁ = 1, σ₂ = 2, and σ₃ = −1. The network G′ = [N′; A′] corresponding to this subproblem is shown in
Figure 3. Note that for this particular example some of the parallel arcs can be coalesced with others
or deleted completely.
[Figure 3: the modified network G′ for this subproblem; sample arc labels include p(s′,1) (0, ∞, −1), f(3,1) (0, 1, 0), and q(3, s′) (0, ∞, 0).]
The first term adjacent to each arc denotes the type of flow variable and the second
term is the lower bound, upper bound, and cost of the flow on that arc.
Figure 3. Modified graph for subproblem
Fulkerson's Out-of-Kilter Algorithm [4] may be used to solve the subproblems. Although the
Out-of-Kilter Algorithm is normally used to solve minimal cost flow problems, in order to apply the
Fulkerson Algorithm the flow problem (primal linear program) must maximize the negative of the
costs. Hence, since the dual variables obtained from the Out-of-Kilter Algorithm are required to solve
the jth subproblem, the flow problem has been cast as one of determining a maximal cost flow solution.
In this manner the columns to enter the basis of the master problem are iteratively generated until
an optimum solution is obtained.
A SPECIAL QUADRATIC ASSIGNMENT PROBLEM
The isolation problem can also be formulated as a special case of the quadratic assignment problem.
(See Lawler [8] for a discussion of the general quadratic assignment problem.)
Minimize  Z = (1/2) Σ_{j=1}^{r} Σ_{x∈N} Σ_{y∈N} c(x, y) π_j(x)[1 − π_j(y)],

subject to

Σ_{j=1}^{r} π_j(x) = 1,  x ∈ N − ∪_{i=1}^{r} R_i,

π_j(x) = 1, x ∈ R_j;  = 0, x ∈ R_i, i ≠ j,

π_j(x) = 0, 1,  x ∈ N,  j = 1, ..., r.
Since

Σ_{j=1}^{r} Σ_{x∈N} Σ_{y∈N} c(x, y) π_j(x)[1 − π_j(y)] = Σ_{x∈N} Σ_{y∈N} c(x, y) − Σ_{j=1}^{r} Σ_{x∈N} Σ_{y∈N} c(x, y) π_j(x) π_j(y),

and Σ_{x∈N} Σ_{y∈N} c(x, y) is a constant, the isolation problem can also be stated as follows:

Maximize  Z′ = (1/2) Σ_{j=1}^{r} Σ_{x∈N} Σ_{y∈N} c(x, y) π_j(x) π_j(y),

subject to

Σ_{j=1}^{r} π_j(x) = 1,  x ∈ N − ∪ R_i,

π_j(x) = 1, x ∈ R_j;  = 0, x ∈ R_i, i ≠ j,  j = 1, ..., r,

π_j(x) = 0, 1,  x ∈ N.
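The constant-term identity used in this equivalence holds whenever every node belongs to exactly one set; a quick numerical check (with arbitrary illustrative integer costs) is:

```python
nodes = [1, 2, 3, 4, 5]
c = {(x, y): (x * y) % 7 for x in nodes for y in nodes}   # arbitrary costs
part = {1: 1, 2: 2, 3: 3, 4: 1, 5: 2}                     # node -> set index
r = 3
pi = lambda j, x: 1 if part[x] == j else 0                # sum_j pi(j, x) == 1

lhs = sum(c[x, y] * pi(j, x) * (1 - pi(j, y))
          for j in range(1, r + 1) for x in nodes for y in nodes)
total = sum(c[x, y] for x in nodes for y in nodes)
quad = sum(c[x, y] * pi(j, x) * pi(j, y)
           for j in range(1, r + 1) for x in nodes for y in nodes)

assert lhs == total - quad   # hence Z = (1/2) total - Z'
```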
AN ALGORITHM TO GENERATE LOCAL OPTIMA
A modified version of an algorithm of Carlson and Nemhauser [2] is presented which generates a
local minimum to the isolation problem. No proof of this algorithm is given since it is so similar to that
of Carlson and Nemhauser.
STEP 0: Define a feasible partition, Pj,j=l, . . ., r, of the nodes xeN.
STEP 1: Select a free node, x ∉ R_j, j = 1, ..., r, which has not been examined since the last
change in the partition.
STEP 2: Let s denote the index of the set to which x is currently assigned (i.e., x ∈ P_s).
Calculate

d_j(x) = Σ_{y∈N} c(x, y) π_j(y),  j = 1, ..., r.

Note that the movement of node x from set s to set t will result in a change in the isolation cost of
d_s(x) − d_t(x).
STEP 3: Calculate Δ = max_j {d_j(x)}.

STEP 4: If Δ ≠ d_s(x), then there exists a set P_t such that d_s(x) − d_t(x) < 0 and, hence, a solution
with a smaller isolation cost may be found. Move node x into set t and go to Step 1.

If Δ = d_s(x), and a free node has not been examined since the last change in the partition, go to
Step 1.

If Δ = d_s(x), and all free nodes have been examined since the last change in the partition, terminate; a local minimum has been found.
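Steps 0 through 4 can be sketched as follows. The edge data, initial partition, and scan order are illustrative assumptions (nodes 1, 2, 3 play the role of the fixed distinguished nodes), not data from the paper.

```python
edges = {(1, 4): 1, (3, 4): 1, (3, 5): 1, (4, 5): 3, (2, 5): 2}
r = 3
part = {1: 1, 2: 2, 3: 3, 4: 1, 5: 2}   # Step 0: a feasible partition
free = [4, 5]                            # nodes not in any R_j

def d(x):
    """Step 2: d_j(x) = sum of c(x, y) over nodes y currently in P_j."""
    out = [0] * (r + 1)
    for (u, v), cost in edges.items():
        if u == x:
            out[part[v]] += cost
        elif v == x:
            out[part[u]] += cost
    return out

changed = True
while changed:
    changed = False
    for x in free:                                     # Step 1
        dx = d(x)
        s0 = part[x]
        t = max(range(1, r + 1), key=lambda j: dx[j])  # Step 3
        if dx[t] > dx[s0]:            # Step 4: cost falls by dx[t] - dx[s0]
            part[x] = t
            changed = True
            break                     # re-examine all free nodes after a change

cut = sum(cost for (u, v), cost in edges.items() if part[u] != part[v])
```

On this instance the scan moves node 4 into node 5's set and terminates at a local (here also global) minimum of cost 3.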
COMPUTATIONAL RESULTS
A computer program for the isolation problem has been written in FORTRAN IV for the IBM
360/50 computer. The decomposed problems were solved utilizing the revised simplex method with
product form of the inverse together with the branch and bound algorithm. The subproblems were
solved utilizing the Out-of-Kilter algorithm as previously described. The computational results for a
set of representative problems are given in Table II.
TABLE II. Computational Results for the Isolation Problem

Problem No.   No. of nodes   No. of edges   No. of sets   No. of edges in D₀   IBM 360/50 execution time (sec.)
     1             12             18             3                 6                       20.7
     2             15             30             3                12                      206.6
     3             15             50             3                13                      725.3
     4             15             50             4                17                      588.4
     5             15             50             5                21                      401.4
     6             15             50             6                24                      325.3
The algorithm for generating local minima was also programmed in FORTRAN IV; all problems
solved to date have been completed in less than 15 sec.
ACKNOWLEDGMENT
The authors wish to thank the MITRE Corporation, under whose auspices this work was performed,
and Professor George Nemhauser of Cornell University for his many helpful suggestions and comments.
REFERENCES
[1] Bennington, G. E., "Multi-Commodity Disconnecting Sets in Undirected Graphs by Node Partition-
ing," Ph. D. Thesis, The Johns Hopkins University, Baltimore, Maryland (1969).
[2] Carlson, R. C. and G. L. Nemhauser, "Scheduling to Minimize Interaction Cost," Operations
Research, 14, 52-58 (1966).
[3] Dantzig, G. B. and P. Wolfe, "The Decomposition Algorithm for Linear Programs," Econometrica,
29, 767-778 (1961).
[4] Fulkerson, D. R., "An Out-of-Kilter Method for Minimal Cost Flow Problems," J. Soc. Ind. and
Appl. Math., 9, 18-27 (1961).
[5] Greenberg, Harvey, "Optimal Attack of a Command and Control Network," Ph. D. Thesis, The
Johns Hopkins University, Baltimore, Maryland (1968).
[6] Jarvis, John J., "Optimal Attack and Defense of a Command and Control Communications Net-
work," Ph. D. Thesis, The Johns Hopkins University, Baltimore, Maryland (1968).
[7] Land, A. H. and A. G. Doig, "An Automatic Method of Solving Discrete Programming Problems,"
Econometrica, 28, 497-520 (1960).
[8] Lawler, E. L., "The Quadratic Assignment Problem," Management Science, 9, 586-599 (1963).
[9] Steinberg, L., "The Backboard Wiring Problem: A Placement Algorithm," SIAM Review, 3,
37-50 (1961).
RESOURCE ALLOCATION FOR TRANSPORTATION*
G. Bennington
North Carolina State University†
and
S. Lubore
The MITRE Corporation
ABSTRACT
A linear programming formulation is described that will permit the optimal assignment
of transportation resources (vehicles) and movement requirements to a network consisting
of a set of designated origins, ports of embarkation, enroute bases, ports of debarkation,
and destinations to achieve a prescribed schedule of arrivals.
INTRODUCTION
This paper describes the Resource Allocation For Transportation (RAFT) linear programming
model, which allocates transportation resources and movement requirements to a set of routes for the
optimal achievement of a prescribed schedule of arrivals.
Several linear programming solutions of ship routing and scheduling problems have been pre-
sented. Laderman, Gleiberman, and Egan [3] have investigated two problems of scheduling iron
ore shipping on the Great Lakes. Fitzpatrick, Bracken, O'Brien, Wentling, and Whiton [2, 7] have
investigated problems of determining the least cost mix of vehicles (air and sea) to satisfy time phased
military shipping requirements.
Rao and Zionts [5] give an algorithm to determine a set of cyclic ship routings for a nonhomogeneous
fleet of ships that will minimize the total shipping costs. Their approach decomposes the linear program
into a set of subproblems, each of which is a network flow problem. Dantzig, Blattner, and Rao [1]
attack this problem in a different fashion in order to determine a set of vehicle routings that have
minimum cost to time ratio. This determination is made by solving a series of shortest path problems
in a network that represents all possible cyclic shipping schedules.
Schwartz [6] presents a linear integer programming formulation of a problem dealing with the
movement of known cargos from origin to destination on time at minimum fleet cost. Lewis, Rosholdt,
and Wilkinson [4] give a comprehensive formulation of a problem dealing with strategic mobility, which
they then treat as a single commodity flow problem through a time expanded network.
*The work reported in this document was sponsored by the Defense Communications Agency under contract
F19628-68-C-0365.
†This work was performed while the author was at the MITRE Corporation.
The RAFT model differs from existing strategic mobility planning LP models in the following
respects:
(a) It permits the optimum allocation of both transportation resources and movement requirements
to available routes from origin to destination.
(b) It allows for the transshipment of movement requirements subject to constant delays at enroute
nodes and transshipment points. (Queues are not allowed at enroute nodes or transshipment points.)
PROBLEM DESCRIPTION
The problem addressed by this paper is one of moving military forces comprised of personnel and
materiel from origin to destination over a network consisting of a set of ports (nodes) and vehicle
oriented routes in order to achieve a desired schedule of arrivals.
For the purposes of this paper, each item to be moved will be called a movement requirement (or
requirement). Associated with each movement requirement is a unique origin and destination, together
with a date when the requirement becomes ready to be moved from its origin and a preferred delivery
date at its destination. Further, each such requirement is restricted to a given quantity of a particular
commodity.
The set of transportation resources is composed of the inventory of military aircraft and ships
as well as commercially available land, sea and air vehicles that may be chartered for military use.
The transportation resources are partitioned into sets of constrained and unconstrained vehicles. Time-
varying vehicle characteristics are considered as well as the availabilities of vehicles as a function of
time. Additionally, vehicle characteristics may vary for each of the routes in the network.
For each origin-destination pair in the network, there are defined one or more directed paths
consisting of sequences of nodes and arcs. The nodes correspond to the physical points in the network
and the arcs correspond to vehicle movements between points. Each node in the network may be con-
strained by the maximum vehicle handling throughput capacity for each time period considered in
the planning period for the movement. Each of these directed paths is designated as a vehicle oriented
route from an origin to a destination.
In particular, each route may have any number of enroute stops for vehicles using the route, and
any number of transshipment points at which the requirements being moved are transferred from one
vehicle type to another.
Figure 1 depicts a typical vehicle oriented route. A requirement shipped over this route travels
from port A (origin) by train to airport B (transshipment point). At airport B, it is transferred to an air-
craft for shipment by way of airport C (enroute stop) to airport D (transshipment point). At airport D,
the requirement is transferred to a truck for eventual delivery to area E (destination). At each enroute
node appropriate transit delay times (loading times, unloading times, and stopover times) are assessed
in the determination of vehicle use and the times at which the requirement reaches each of the nodes.
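The bookkeeping just described can be sketched as a simple walk along the route; all travel times and node delays below are hypothetical values chosen for illustration.

```python
# Hypothetical legs of the Figure 1 route and fixed node delays (time periods)
legs = [("Port A", "Airport B", 10),
        ("Airport B", "Airport C", 3),
        ("Airport C", "Airport D", 4),
        ("Airport D", "Area E", 6)]
delay = {"Port A": 1, "Airport B": 2, "Airport C": 1, "Airport D": 2}

def arrival_times(legs, delay, ready=0):
    """Arrival time at each node: assess the load/stopover delay at the
    departure node, then add the leg's travel time."""
    t = ready
    times = {}
    for u, v, travel in legs:
        t += delay[u] + travel
        times[v] = t
    return times

times = arrival_times(legs, delay)
```

With the numbers above, the requirement reaches Airport B at time 11 and Area E at time 29; queues never form because the delays are fixed constants.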
[Figure 1: Port A → (train) → Airport B → (plane) → Airport C → (plane) → Airport D → (truck) → Area E.]

Figure 1. A typical route
If a particular movement is to be feasible, there must be residual capacity at each node in the
network at the times corresponding to vehicle departures from these nodes.* Additionally, sufficient
*Destination nodes are constrained by vehicle arrivals.
vehicles must be available at the appropriate times to move the requirements over the several routes.
No queueing is allowed at intermediate nodes of a route; however, fixed delays may be assessed at
each node.
MATHEMATICAL FORMULATION
NOTATION
Movement Requirements
The set I represents the set of movement requirements. For each requirement i ∈ I, we denote its
origin by s_i and its destination by t_i. Each requirement consists of a nonnegative quantity F_i of
personnel or materiel to be moved.
Time Periods
J will denote the set of time periods allowed for a movement. Specific subsets of J will be defined
later in the discussion of the constraint equations.
Nodes
The set P represents the nodes in the network. Each origin, transshipment point, enroute node,
and destination is an element of the set P.
Transportation Resources
The set K is defined as the set of vehicles to be considered for a particular movement. Each vehicle
k ∈ K is an element of the subset of constrained vehicles K₁ ⊂ K or the subset of unconstrained vehicles
K₂ ⊂ K. It is assumed that the subsets K₁ and K₂ are mutually exclusive and collectively exhaustive.
Routes
The set N represents the set of all vehicle oriented routes in the network, and the nth route is
represented by the sequence

[p₁ⁿ, (k₁ⁿ), p₂ⁿ, ..., p_qⁿ, (k_qⁿ), ..., (k_{Q−1}ⁿ), p_Qⁿ],

where

p_qⁿ ∈ P = the qth node in the nth vehicle oriented route

and

(k_qⁿ) ∈ K = the arc representing the type of vehicle moving from p_qⁿ to p_{q+1}ⁿ.

A route segment is a portion of a route

[p_qⁿ, (k_qⁿ), ..., (k_{r−1}ⁿ), p_rⁿ]

that employs the same vehicle, i.e.,

k_qⁿ = k_{q+1}ⁿ = ⋯ = k_{r−1}ⁿ,

where

p_qⁿ and p_rⁿ = transshipment points, an origin, or a destination.
A route, therefore, consists of a series of route segments corresponding to the sequence of vehicles
employed on the route.
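The segment definition can be made concrete. The sketch below writes the Figure 1 route as the alternating node/vehicle sequence (node and vehicle names are illustrative) and splits it into maximal same-vehicle segments.

```python
def route_segments(route):
    """Split [p1, (k1), p2, (k2), ..., pQ] into maximal same-vehicle segments."""
    nodes = route[0::2]       # p1, p2, ..., pQ
    vehicles = route[1::2]    # k1, ..., k_{Q-1}
    segments, start = [], 0
    for q in range(1, len(vehicles)):
        if vehicles[q] != vehicles[q - 1]:   # vehicle change: a transshipment point
            segments.append((nodes[start:q + 1], vehicles[start]))
            start = q
    segments.append((nodes[start:], vehicles[start]))
    return segments

route = ["A", "train", "B", "plane", "C", "plane", "D", "truck", "E"]
segments = route_segments(route)
```

For this route the three segments are A-B by train, B-C-D by plane (with C an enroute stop), and D-E by truck, matching the Figure 1 description.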
For convenience in defining the constraint equations in the later sections, specific subsets of the
set of routes will be defined.
The set of routes over which requirement i may move is denoted as
N^ C N={neN | Si = p», ti = pfi ,
where
5; = origin for requirement i, and
U— its destination.
The set of routes incident with node p is denoted as
/V (2) C N={n€N\P'q=P}-
The set of routes over which vehicle k may travel is denoted as
N(3)q yV={ ne /V| (k^) = k}.
Decision Variables
The set X represents the set of decision variables in the linear program. That portion of the ith
movement requirement which leaves its origin in the jth time period traveling over the nth route is
denoted as x_{i,j,n}. A decision variable x_{i,j,n} ∈ X is defined for each requirement for each feasible shipping
time period for each valid route.
CONSTRAINT EQUATIONS
The set of feasible solutions is defined by the following set of linear inequalities:
Requirement Constraints
(1)  Σ_{x∈X_i^(1)} D_{i,i′,j,k} x_{i′,j,n} ≥ F_i,  i ∈ I,

where

X_i^(1) = {x_{i′,j,n} ∈ X | n ∈ N_{i′}^(1), i′ ∈ I},

D_{i,i′,j,k} = the amount of requirement i accompanying a unit of requirement i′ on vehicle k if shipped
in time period j (note that D_{i,i,j,k} = 1), and

F_i = the total amount of requirement i to be moved.

These constraints assure that at least enough of requirement i will be delivered; any overshipment
represents residual vehicle capability. If accompanying cargo is not allowed, these constraints are
equalities.
Node Constraints
The nodes in the network may be constrained to reflect their vehicle handling capability and/or
to reflect the outloading, throughput, or reception capability.
If C_{j,p} is the maximum number of vehicle departures (vehicle arrivals for a destination), the constraint
equation is:

(2)  Σ_{x∈X_{j,p}^(2)} x_{i,j′,n} / V_{i,j′,k,n} ≤ C_{j,p},  j ∈ J, p ∈ P.

The set X_{j,p}^(2) is defined as:

X_{j,p}^(2) = {x_{i,j′,n} ∈ X | n ∈ N_p^(2); j′ ∈ J_j^(2)},

where

J_j^(2) = the set of time periods for each of the shipments x_{i,j′,n} which leave (arrive at, for a destination)
node p in time period j, and

V_{i,j′,k,n} = the capacity of the kth type vehicle to carry the ith requirement in time period j′ over the
nth route.

If C′_{j,p} represents a node outloading, throughput, or reception capacity, the constraint is

Σ_{x∈X_{j,p}^(2)} A_{i,j,n} x_{i,j′,n} ≤ C′_{j,p},  j ∈ J, p ∈ P,

where

A_{i,j,n} = the per unit consumption in the jth time period of node capacity corresponding to
shipment x_{i,j′,n}.
Vehicle Constraints
(3)  Σ_{x∈X_{j,k}^(3)} x_{i,j′,n} / B_{i,j,k,n} ≤ M_{j,k},  j ∈ J, k ∈ K₁.

The set X_{j,k}^(3) is defined as

X_{j,k}^(3) = {x_{i,j′,n} ∈ X | n ∈ N_k^(3); j′ ∈ J_j^(3)},

where

J_j^(3) = the set of time periods for each of the shipments x_{i,j′,n} which utilize the kth vehicle type
in the jth time period,

B_{i,j,k,n} = the total carrying capability of each vehicle of the kth type when employed during the
jth time period on the nth route carrying the ith requirement (in the computation of
B_{i,j,k,n}, the average utilization rate of a vehicle can be taken into consideration), and

M_{j,k} = the number of vehicles of type k available for use in time period j.
OBJECTIVE FUNCTION
Since the movement requirements have preferred delivery dates, the objective function must
consider both the allocation of scarce resources and the scheduling of arrivals of the requirements at
their destinations.
Hence, in order to deliver the requirements as close as possible to the preferred delivery dates,
the objective function minimizes the total weighted deviations about the preferred delivery dates.
The objective function is:
(4)  minimize  Σ_{x∈X} W_{i,j,n} x_{i,j,n},

where

W_{i,j,n} = a weighting factor.
Weighting Factors
The weighting factors are functions of the deviation of the arrival time period of the ith requirement
at its destination, if shipped in the jth time period over the nth route, from the preferred delivery date
of the ith requirement. In general, each W_{i,j,n} may be an arbitrary function of the deviations consistent
with the subjective utilities of early or late deliveries as well as any routing or vehicle preference.
Figure 2 illustrates possible utility functions to perform the scheduling. Figure 2(A) depicts a situ-
ation where early arrivals are always preferred over late arrivals, and the weights vary linearly with
respect to the deviations. Figures 2(B) and 2(C) correspond to equal trade-offs between early and late
deviations; Figure 2(B) minimizes the absolute value of the deviations, while Figure 2(C) minimizes
the square of the deviations. Other weighting factors are easily envisioned and the particular choice
should be made consistent with the specific movement to be analyzed.
[Figure 2: three weighting-function curves, each marked at the preferred date.]

Figure 2. Possible weighting functions to minimize deviations
In the analysis used to date, the weighting function defined in Figure 2(A) has been used. Point B
is the preferred delivery date for some requirement and is given a weighting factor value of one. Any
deviation from point B has a weighting factor greater than one. The size of the weighting factor is con-
sonant with the particular region of the curve within which an arrival occurs.
Regions of Weighting Factor Curve
Early Infeasible Region. This region covers the periods of time that the requirement is either not
ready to be moved or cannot possibly arrive. In a particular movement, arrivals that are more than a
fixed number of time periods prior to the preferred delivery date could be inadmissible. In this region,
a weight would not be assigned; rather, the decision variable would be suppressed.
Early Region. This region is represented by the points from A to B, corresponding to the feasible
early arrival periods. This region is always preferred to an arrival in the late region. The weighting
factor is assigned a value of one at point B and is increased in steps of unity for each preceding time
period. For example, if point A corresponds to an arrival two time periods prior to the preferred
delivery, the weighting factor assigned to shipments arriving in the time period corresponding to
point A has a value of three.
Late Region. This region, represented by the points from C to D, corresponds to a late arrival.
Point C is assigned a weight that is one greater than any other requirement's maximum early arrival
weight. The weights assigned to points beyond point C increase in value in unit increments. Hence,
if the maximum possible early weight for any requirement is 40, the first late weight would be 41. In this
manner, trade-offs between early and late arrivals are always biased towards the early end of the
scale for the requirements to be moved in a particular problem.
Late Infeasible Region. This region of the curve corresponds to deliveries that would arrive beyond
the planning period for the total movement. This region may also correspond to deliveries that are in-
admissible, since they arrive more than some fixed number of time periods after the preferred delivery
period. In this region, the decision variables would not be assigned weights; instead, they would be
suppressed.
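The four regions above can be made concrete with a small sketch (ours, not the authors' FORTRAN code; the function name and the argument carrying the movement-wide maximum early weight are assumptions). It returns the weight for an arrival period under the Figure 2(A) scheme, or None where the decision variable would be suppressed:

```python
def arrival_weight(arrival, preferred, earliest, latest, max_early_weight_any=0):
    """Weight for an arrival period under the Figure 2(A) scheme.

    Returns None when the decision variable should be suppressed (early or
    late infeasible regions).  Early arrivals step up by one per period
    before the preferred date; the first late period starts one above the
    largest early weight over *all* requirements, so trade-offs are always
    biased toward early delivery.
    """
    if arrival < earliest or arrival > latest:
        return None                        # infeasible region: suppress x_{i,j,n}
    if arrival <= preferred:
        return 1 + (preferred - arrival)   # weight 1 at preferred, +1 per early period
    return max_early_weight_any + (arrival - preferred)  # first late weight = max + 1
```

With a preferred date of 5 and earliest feasible arrival at period 3, an arrival in period 3 gets weight 3, reproducing the point-A example above; with a movement-wide maximum early weight of 40, the first late period gets weight 41.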
MODEL OPTIONS
Alternative Objective Functions
Alternative objective functions may be utilized, depending upon the particular type of
problem to be analyzed.
Minimum Time Solution. In some strategic movements, it is desirable to accomplish the overall
movement as quickly as possible, subject to the constraints described above. Figure 3 (weighting
function for minimum time) depicts a weighting function that will minimize the weighted delivery
periods of all requirements. If a trade-off exists
in the same arrival period between one requirement arriving early and another late, preference is given
to the requirement that would arrive after its preferred delivery period. The discontinuity between
points B and C is used to resolve such a trade-off.
Minimum Cost. Rather than minimizing the deviations from a set of preferred delivery periods, it
may be more important to minimize the total cost of the movement. In this case, the objective function
becomes:
(5) \quad \text{minimize} \sum_{x \in X} c_{i,j,n}\, x_{i,j,n},

where

c_{i,j,n} = the cost of shipping a unit of the ith requirement in the jth time period over the nth route.
Maximum Flow. If the object is to maximize the total flow of commodities over the network subject
to the vehicle and node capacity constraints, the objective function becomes
(6) \quad \text{maximize} \sum_{x \in X} x_{i,j,n}.
Secondary Optimization
In some instances, it is not only desirable to attain a minimum deviation or minimum time solution,
but also to achieve this objective at the minimum cost. This can be accomplished by first solving the
minimum time problem and then annexing the objective function to the constraint matrix to maintain
an optimal solution to the original problem. The linear cost function is then minimized.* This results
in the least cost solution that has the minimum total deviation or minimum time.
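The annexation step can be sketched as a small matrix manipulation (ours; the paper's footnote notes that other implementations are possible, and the function and variable names here are assumptions). After the first-stage LP is solved, its objective row is appended as a constraint pinned at the optimal value, and the cost vector becomes the new objective:

```python
def annex_objective(A_ub, b_ub, primary_c, primary_opt, secondary_c):
    """Set up the second stage of a two-stage (lexicographic) optimization.

    After solving  min primary_c . x  s.t.  A_ub x <= b_ub  with optimal
    value primary_opt, the primary objective is annexed to the constraint
    matrix as the row  primary_c . x <= primary_opt,  and the cost vector
    secondary_c becomes the new objective to be minimized.
    """
    A2 = [row[:] for row in A_ub] + [primary_c[:]]   # annexed objective row
    b2 = b_ub[:] + [primary_opt]
    return A2, b2, secondary_c
```

Because the first-stage optimum is the smallest attainable value of the primary objective, the annexed row can only bind at equality, so the second stage searches for the least-cost point among the first-stage optima.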
Trade-Off Bias Among Commodities
The weighting functions considered in the case of minimum deviation or minimum time solutions
have so far considered a unit of each commodity to be of equal value. Since in many cases, a single
passenger is not of equal value to a ton of cargo, it is desirable to give preference to a unit of the more
valuable commodity if competition for scarce resources exists. One way to eliminate this trade-off bias
is to multiply each weighting function value by an additional factor. Hence, if each passenger delivered
on time is considered to be as valuable as 400 pounds of cargo, each weighting factor for the passenger
requirement would be multiplied by one-fifth. In this manner, every five passengers delivered are
equivalent to a single ton of cargo delivered. Similar factors may be introduced for other commodities.
Priorities Among Movement Requirements
Each requirement can be given a priority to favor its shipment over other shipments in a given
time period if enough resource capacity is not available to ship all competing requirements
simultaneously. In general, each requirement will be delivered eventually, so it is only the increments
of weights between arrival times that are important and not their absolute values.
Consider the case of two requirements with the weighting functions shown in Figure 4. The higher
priority requirement corresponds to the line segments A' B and CD'. Because the slopes of A' B and
*There are several alternative methods of implementing a secondary optimization, but this method has yielded good results
for the authors.
CD' are twice that of AB and CD, respectively, the incremental penalty for deviating from the
preferred delivery period is greater for the higher priority requirement and, hence, the higher priority
requirement will always be given preference over the lower priority requirement. In addition, the
distances from B to C and from B' to C' tend to force the late delivery of the lower priority requirement
before that of the higher priority requirement.

FIGURE 4. Weighting function with two priority classes
In the general case of many priority classes, the slopes are increased in proportion to the priorities,
and the step from B to a delivery one period late is greater than the largest possible weight for the next
lower priority. In this manner, a number of different priorities may be handled.
Movement Requirement Mode Preference
In some movements, a shipping mode preference (air or sea) is given for each movement require-
ment. Three options are available in the model with respect to modal preference.
(a) Fixed Mode — The requirement is forced to travel by the preferred mode by suppressing all of
the decision variables for which the preferred mode of the shipment does not match that of any vehicle
on the route.
(b) Preferred Mode — The objective function weights are decreased by a small quantity, say δ, if
the corresponding x_{i,j,n} results in a shipment over a route that uses vehicles consistent with the
preferred mode of shipment.
(c) Mode Indifference — The mode preference, if any, is ignored.
Vehicle Preference by Cargo Type
In some movements certain vehicles may be preferred for the shipment of one cargo type over
another. The model specifies an ordered preference of vehicles for each cargo type. In a manner similar
to the treatment of a movement requirement mode preference, the objective function is biased to reflect
preferences for vehicle usage.
Table I represents a typical vehicle-cargo preference relationship. In the table, the lower numerical
values represent the higher priorities. Vehicle 1 is preferred over all other vehicles for the shipment
of cargo type 1 and vehicle 3 is preferred for cargo type 4. The relative preference for utilization of
vehicle 2 is the shipment of cargo type 2 first, followed by cargo types 3, 1, and 4.
Table I. Vehicle-Cargo Preference

                     Cargo Type
                     1     2     3     4
  Vehicle Type 1     1     6    10    10
  Vehicle Type 2     5     1     2     6
  Vehicle Type 3    10     7     2     1
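The preference ranks of Table I can be folded into the objective as small biases, in the manner described above for mode preference; the sketch below is ours (the bias quantity delta and the rank-to-bias rule are assumptions, not the authors' specification):

```python
# Table I, as rank[vehicle][cargo]; lower rank = higher priority.
PREF_RANK = {
    1: {1: 1, 2: 6, 3: 10, 4: 10},
    2: {1: 5, 2: 1, 3: 2, 4: 6},
    3: {1: 10, 2: 7, 3: 2, 4: 1},
}

def biased_weight(base_weight, vehicle, cargo, delta=0.01):
    """Subtract a bias that shrinks as the preference rank worsens, so ties
    in the base weight are broken toward preferred vehicles without
    disturbing the ordering of materially different weights."""
    return base_weight - delta / PREF_RANK[vehicle][cargo]
```

For example, a shipment of cargo type 1 gets a slightly lower (more attractive) weight on vehicle 1, its most preferred vehicle, than on vehicle 2, provided delta is kept small relative to the unit steps of the weighting function.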
Increased Aircraft Utilization

The productivity of an aircraft type is reflected in the value of B_{i,j,k,n} in the vehicle constraint
equations (Eqs. (3)). The value of B_{i,j,k,n} is based upon the long term average utilization rate of an
aircraft over a given route. However, for short periods of time, this average utilization rate might be
exceeded as long as the long term average value is maintained. For these periods of increased
utilization, a higher aircraft productivity \hat{B}_{i,j,k,n} may be used.

The constraint equations, Eqs. (3), can be modified for aircraft to allow for increased utilization
over some time periods to compensate for less than average utilization in preceding time periods.
This is accomplished by defining a variable T_{j,k} corresponding to this underutilization. This variable
may then be carried forward into subsequent time periods, where it serves to augment the available
vehicle productivity. The modified Eqs. (3) become
(7) \quad \sum_{x \in X} \frac{x_{i,j,n}}{B_{i,j,k,n}} + T_{j,k} - T_{j-1,k} \le M_{j,k}, \quad j \in J,\; k \in K',

with the understanding that T_{0,k} = 0.
Additionally, it is necessary to add a second set of vehicle constraint equations to insure that the
amount of slack capacity does not exceed the total possible maximum vehicle productivity in a given
time period:

(8) \quad \sum_{x \in X} \frac{x_{i,j,n}}{\hat{B}_{i,j,k,n}} \le M_{j,k}, \quad j \in J,\; k \in K'.
If it is also desired to restrict greater than average vehicle utilization to a single time period, the
following constraint equations are used instead of Eqs. (8):

(9) \quad C_1\Big(M_{j,k} - \sum_{x \in X} \frac{x_{i,j,n}}{\hat{B}_{i,j,k,n}}\Big) - T_{j,k} \ge 0, \quad j \in J,\; k \in K',

where

C_1 = a positive constant.

Equation (9) will cause the slack variable T_{j,k} to vanish (due to the nonnegativity restriction on linear
programming variables) should the maximum productivity be reached. The term in parentheses acts
like Eq. (8), and the value of the constant C_1 determines the maximum value of slack that may be
carried forward.
If vehicles are allowed to operate at their maximum utilization rate for two or more successive
periods (if sufficient unused slack capacity is available), the variable T_{j,k} in Eq. (9) may be replaced
by T_{j+m-1,k}, where m is the number of successive periods allowed.
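For a single vehicle type, the rows of Eqs. (7) and (9) can be generated mechanically. The sketch below is ours (the variable layout, the simplifying choice \hat{B} = B, and all names are assumptions); the decision vector holds the shipment variables first and the slacks T_j last:

```python
def utilization_rows(num_periods, num_ship_vars, B, M, C1):
    """Build inequality rows (A, b) with A z <= b for Eqs. (7) and (9),
    for one vehicle type.  Decision vector z = (x_1..x_n, T_1..T_J).

    B[j][i] -- productivity of shipment variable i in period j (0 means x_i
               does not load this vehicle type in period j)
    M[j]    -- vehicle availability in period j
    C1      -- positive constant bounding the slack carried forward
    (For this sketch the increased productivity Bhat is taken equal to B.)
    """
    rows, rhs = [], []
    for j in range(num_periods):
        # Eq. (7):  sum_i x_i/B + T_j - T_{j-1} <= M_j   (T_0 = 0)
        a = [0.0] * (num_ship_vars + num_periods)
        for i in range(num_ship_vars):
            if B[j][i]:
                a[i] = 1.0 / B[j][i]
        a[num_ship_vars + j] = 1.0            # + T_j
        if j > 0:
            a[num_ship_vars + j - 1] = -1.0   # - T_{j-1}
        rows.append(a)
        rhs.append(M[j])
        # Eq. (9):  C1*(M_j - sum_i x_i/Bhat) - T_j >= 0, rearranged to the
        # "<=" form  C1*sum_i x_i/Bhat + T_j <= C1*M_j
        a2 = [0.0] * (num_ship_vars + num_periods)
        for i in range(num_ship_vars):
            if B[j][i]:
                a2[i] = C1 / B[j][i]
        a2[num_ship_vars + j] = 1.0           # + T_j
        rows.append(a2)
        rhs.append(C1 * M[j])
    return rows, rhs
```

Any LP code accepting inequality rows in "A z <= b" form can consume the output directly; when period j uses its full productivity, the Eq. (9) row forces T_j to zero, exactly as described above.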
ALTERNATIVE FORMULATIONS
Alternative formulations and modifications were investigated during the formulation of the linear
programming model described above. Two of these warrant discussion, since they highlight future
areas of research or give a better understanding of why some models have not been developed in spite
of their theoretical tractability.
Route (Arc-Chain) Formulation
The previous formulation may be described as a route (arc-chain) formulation. The variable x_{i,j,n}
corresponds to the flow over the nth route, and the constraints upon the vehicles employed on the route
are generated by entering the variable x_{i,j,n} into each of the appropriate vehicle constraint equations,
using the implicit assumption that no storage is allowed at any of the transshipment nodes and that
the next vehicle is available when required.
Node-Arc Formulation
A node-arc formulation defines a new variable x_{i,j,q} corresponding to a shipment on the qth vehicle-
oriented arc carrying requirement i in the jth time period. Using this formulation, it is possible to
consider in-transit storage of requirements, as long as enough of each requirement i is delivered at the
initial node of arc q before it can be transshipped over the arc. This requires additional constraint
equations at each transshipment node m of the form:

(10) \quad \sum_{x \in A(m)} x_{i,j,q} \le \sum_{x \in B(m)} x_{i,j,q}, \quad i \in I,\; j \in J,

where

B(m) = the set of shipments of requirement i into node m that are available for transshipment in
or prior to time period j, and

A(m) = the set of shipments of requirement i out of node m in or prior to time period j.
This is just an inequality form of the conservation of flow equations, with the excess of input over output
taken as storage at the mth node. In addition, it is possible to place an upper bound upon the quantity
of goods being stored.
The node-arc formulation has two attractive features: (1) inventories at nodes are allowed;* and
(2) the number of variables only increases linearly with the number of route segments. The distressing
*A limited capability to incorporate in-transit storage within the arc-chain formulation may be accomplished by defining
additional variables corresponding to the storage of a requirement at a transshipment node for a fixed number of time periods.
feature of this formulation is the increase in the number of constraint equations required. An ad-
ditional constraint equation is required for each transshipment node, for each requirement, in each
time period. The linear program must manipulate a larger basis, and may require additional auxiliary
storage, which sharply increases the computation times.
For the above reasons the route formulation was chosen for implementation. The additional
capability of a node-arc formulation appears to be outweighed by the increased computational burden.
Although the number of possible routes in a graph can grow explosively as the complexity of the
network increases, this does not seem to be a practical limitation, since many sets of routes can be
excluded a priori, and the computational effort resulting from the additional variables does not seem
to be prohibitive.
REPOSITIONING OF VEHICLES
The previous formulations have explicitly made assumptions about the distribution of vehicles
to shipping nodes in the generation of the vehicle constraints. A common assumption is that a vehicle
"cycles" back to the point from which it left. Under this assumption a vehicle used for a shipment
is constrained from further use until it returns to the shipping node. Upon its return, it is made available
for immediate reuse at any other shipping node in the network. For some problems this may be a serious
error. If a ship returning from London to New York is immediately reused at San Francisco without
a transit time delay to reposition to the West Coast, an unrealistic situation exists.
Vehicle repositioning may be taken into consideration by defining repositioning nodes m for each
vehicle and defining a new decision variable y_{j,k,m,m'} corresponding to the number of empty vehicles
of the kth type repositioned from node m to node m' in time period j. It is necessary to add additional
vehicle repositioning constraint equations to maintain a flow balance at each of the nodes m and m'.

CASE 1: m is an unloading point for the kth vehicle type:

(11) \quad \sum_{x \in B(m)} \frac{x_{i,j',n}}{B_{i,j',k,n}} \ge \sum_{y \in A(m)} y_{j,k,m,m'}, \quad k \in K,\; j \in J,

where

B(m) = the set of shipments using the kth vehicle type that are available for repositioning from node
m in or prior to time period j, and

A(m) = the set of vehicles of type k repositioned from node m in or prior to time period j.
CASE 2: m' is a loading point for the kth vehicle type:

(12) \quad \sum_{y \in A(m')} y_{j',k,m,m'} \ge \sum_{x \in B(m')} \frac{x_{i,j',n}}{B_{i,j',k,n}}, \quad k \in K,\; j \in J,

where

A(m') = the set of vehicles of type k repositioned to node m' for reuse in or prior to
time period j, and

B(m') = the set of shipments out of node m' using vehicle type k in or prior to the jth time period.
In either case, the slack variable in the constraint equation corresponds to the inventory of vehicles
at node m or node m' .
Although these additional constraints seem to enhance the model, it must be remembered that
the model is a linear program with continuous variables. Hence, the number of vehicles repositioned
is not necessarily of integral value and the added capability for repositioning may not lead to a more
realistic solution.
COMPUTER MODEL
Computer programs have been written to generate the matrix elements corresponding to Eqs. (1)
through (9) with the exception of the maximum flow option, and to generate reports resulting from the
linear programming results. These programs are written in FORTRAN IV to be used in conjunction
with the Control Data ALEGRO Linear Programming System on a 3600 computer or to be used with
the IBM mathematical programming system (MPS) on a 360/50 or 360/65 computer. The authors have
found that by inserting "artificial" variables with high objective function weights into the appropriate
constraint equations, the linear program running time is significantly reduced from the running time
resulting from the use of the two-phase method of the revised simplex algorithm.
Only limited computational results have been obtained to date; however, Table II summarizes the
computational experience obtained while testing the model formulation on a set of sample problems.
All of the samples were run on an IBM 360/50 computer. The execution times do not include the time
for matrix generation or the reformatting of the linear programming solution.
Table II. Computational Results

  Problem  Rows  Columns  Density (%)  Iterations  Execution   Time     Require-  Routes  Vehicle
  number                                           time (min)  periods  ments             types
  1          55      252         6.55          41        0.49       12         6       4        4
  2          55      252         6.55          17        0.79       12         6       4        4
  3          73      199         5.81          28        0.50       12        12       8        4
  4          85      244         5.61          23        0.45       12        12      28        4
  5         133      293         3.48          46        0.70       12        12      28        8
  6         263     2924         1.32         136        4.91       17        74     108       13
  7         390      862         1.54         107        2.67       25       110      35       12
  8         390     1422         2.34         112        4.29       25       110      77       12

(Rows through Execution time are the LP parameters; Time periods through Vehicle types describe the problem.)
Problems 1 and 2 consider a network of only two nodes and four parallel routes between these
nodes. This network corresponds to a movement of requirements from a single POE to a single POD
utilizing four distinct types of vehicles. The rows of the constraint matrix are composed of 48 vehicle
constraint equations, one per vehicle type per time period, together with six requirement constraints.
The remaining row is the objective function.
Problem 1 minimizes the overall closure time of the movement, and Problem 2 minimizes
the weighted deviations of the movement requirements about their preferred delivery dates. Although
Problem 2 requires fewer iterations to achieve an optimal solution, it requires more computational time.
Behavior of this type is not atypical of the computational experience obtained with most commercial
LP codes when applied to small problems, and is most likely due to the heuristic multiple pricing rules
or reinversion policy employed in the iterating agenda.
Problem 3 is an extension of the first two problems and includes a second POE-POD pair, an
enroute base for aircraft, and an increased number of movement requirements. In attaining the minimum
deviation solution, the number of aircraft sorties through the enroute base was constrained; however,
transshipment was not allowed.
Problems 4 and 5 allow transshipment and include two origins, two POE's, one enroute base, one
POD, and one destination. The problems differ only in the number of constrained vehicle types. One
of the POE's and the enroute base are constrained. Problem 4 did not consider the origin to POE or
POD to destination vehicles to be limited in number. Problem 5 placed an upper bound upon the number
of these short haul surface vehicles available.
Problems 6 through 8 consider larger problems that are similar in structure to Problem 5. As
before, the number of iterations required and the computational times do not appear to obey any fixed
relationship with respect to the problem size in terms of the parameters of the linear programs.
REFERENCES
[1] Dantzig, G. B., W. O. Blattner, and M. R. Rao, "Finding a Cycle in a Graph with Minimum Cost
to Time Ratio with Application to a Ship Routing Problem," Operations Research House, Stanford
University, California, Report 66-1 (Nov. 1966).
[2] Fitzpatrick, G. R., J. Bracken, M. J. O'Brien, L. G. Wentling, and J. C. Whiton, "Programming the
Procurement of Airlift and Sealift Forces: A Linear Programming Model for Analysis of the Least-
Cost Mix of Strategic Deployment Systems," Nav. Res. Log. Quart., 14, 241-255 (1967).
[3] Laderman, J., L. Gleiberman, and J. F. Egan, "Vessel Allocation by Linear Programming," Nav.
Res. Log. Quart., 13, 315-320 (1966).
[4] Lewis, R. W., E. F. Rosholdt, and W. L. Wilkinson, "A Multi-Mode Transportation Network Model,"
Nav. Res. Log. Quart., 12, 261-274 (1965).
[5] Rao, M. R. and S. Zionts, "Allocation of Transportation Units to Alternative Trips — A Column
Generation Scheme with Out-of-Kilter Subproblems," Operations Research, 16, 52-63 (1968).
[6] Schwartz, N. L., "Discrete Programs for Moving Known Cargos from Origins to Destinations on
Time at Minimum Bargeline Fleet Cost," Transportation Science, 2, 134-345 (1968).
[7] Whiton, J., "Some Constraints on Shipping in Linear Programming Models," Nav. Res. Log. Quart.,
14, 257-260 (1967).
INTERVAL ESTIMATION OF THE NORMAL MEAN SUBJECT TO
RESTRICTIONS, WHEN THE VARIANCE IS KNOWN*
Saul Blumenthal
New York University
Bronx, New York
1. INTRODUCTION AND SUMMARY
We consider here the problem of obtaining confidence interval estimates of the mean of a normal
distribution when the mean is restricted to be nonnegative, and the variance is known. The problem
of point estimation for truncated parameters has been considered by Katz [13] and Farrell [9]. For
the interval estimation problem, we shall work with a loss function
(1.1) \quad L(\theta, I) = c(\delta_2(\cdot) - \delta_1(\cdot)) + \begin{cases} 0, & \theta \in I(\cdot), \\ 1, & \theta \notin I(\cdot), \end{cases}

where the interval estimate I(\cdot) = [\delta_1(\cdot), \delta_2(\cdot)] is closed with upper and lower end points \delta_2 and
\delta_1, respectively, and \theta is the mean being estimated. The loss function (1.1) was used by Joshi [12] in
demonstrating admissibility of the standard interval estimate for the normal mean in the unrestricted
case. Dudewicz [8] considers this loss function in constructing interval estimates for ranked means.
For the unrestricted mean problem, Ferguson [11, p. 184] mentions this loss function, and variations
on it have been used by Aitchison and Dunsmore [1] where the expected size of the error rather than
the error probability appears in the risk function, and by Konijn [14 and 15] where the risk function
used the noncoverage probability and the probability of covering incorrect values.
The problem of constructing confidence intervals for restricted parameters has been considered
in some detail for the logistic distribution by Bartholomew [3] and has been discussed in general by
Stein [18] and Bartholomew [2].
In Section 2, we obtain Bayes estimates with respect to folded normal prior distributions, and
demonstrate the admissibility of the generalized Bayes interval obtained by taking a uniform prior
distribution on (0, \infty). The admissibility is subject to the same equivalence class restrictions as in
Joshi [12] and the reader is referred to that paper for a full discussion of admissibility for interval
estimators. The risk functions of the generalized Bayes estimator and a natural truncated estimator
are discussed in Section 3, along with the question of minimaxity. The truncated estimator is shown
to be minimax. In Section 4, we consider sequential stopping rules where sampling cost is linear and
we find rules which are asymptotically pointwise optimal, and asymptotically optimal Bayes rules in
the sense discussed by Bickel and Yahav [6]. In Section 5, we replace c by h(\theta) in (1.1) and examine
how the length of the resulting Bayes intervals is related to h(\theta).
*Research supported by National Science Foundation Grants GP-7024 and GP-7399, School of Engineering and Science,
New York University.
2. BAYES INTERVALS AND ADMISSIBILITY
The observations X_1, \ldots, X_n are independent with common normal distribution N(\theta, \nu^2), with
known variance \nu^2. The average \bar{X} is then N(\theta, \nu^2/n). To avoid continually writing \nu^2/n, we define

(2.1) \quad \sigma^2 = \nu^2/n, \qquad Y = \bar{X}.
Corresponding to the loss function (1.1), we have the risk function

(2.2) \quad R(\theta, I) = c\, E_\theta(\delta_2(Y) - \delta_1(Y)) + 1 - P_\theta(\delta_1(Y) \le \theta \le \delta_2(Y)).

To generate Bayes estimates, we let \theta have a normal distribution with mean 0 and variance \tau^2, confined
to \theta \ge 0, i.e.,

(2.3) \quad g_\tau(\theta) = (2/\pi\tau^2)^{1/2}\, e^{-\theta^2/2\tau^2}, \quad \theta \ge 0; \qquad g_\tau(\theta) = 0, \quad \theta < 0.
The posterior density of \theta given Y is

(2.4) \quad g_\tau(\theta \mid y) = \{\Phi(y/\sigma\gamma)\}^{-1} \{\gamma^2/2\pi\sigma^2\}^{1/2}\, e^{-(\gamma^2/2\sigma^2)(\theta - y/\gamma^2)^2}, \quad \theta \ge 0; \qquad g_\tau(\theta \mid y) = 0, \quad \theta < 0,

where

(2.5) \quad \gamma^2 = 1 + (\sigma^2/\tau^2).
For the purpose of finding the Bayes estimation interval, we characterize interval estimators by
a function \xi(\theta, y), where

(2.6) \quad \xi(\theta, y) = 1 \text{ if } \delta_1(y) \le \theta \le \delta_2(y); \quad \xi(\theta, y) = 0 \text{ otherwise.}
LEMMA 2.1: The Bayes estimation interval for prior densities (2.3) and loss function (1.1) is given by

(2.7) \quad \delta_1(Y) = \max[0, M(\gamma, Y) - D(\gamma, Y)], \qquad \delta_2(Y) = \max[0, M(\gamma, Y) + D(\gamma, Y)],

where

(2.8) \quad M(\gamma, Y) = Y/\gamma^2

and

(2.9) \quad D(\gamma, Y) = \{(2\sigma^2/\gamma^2) \log [(\gamma^2/2\pi\sigma^2)^{1/2}/(c\,\Phi(Y/\sigma\gamma))]\}^{1/2},
provided that
(2.10) \quad c < (\gamma^2/2\pi\sigma^2)^{1/2}\, [\Phi(Y/\sigma\gamma)]^{-1}.
If (2.10) is violated, the interval shrinks to a point which can be chosen arbitrarily as any nonnegative
number.
PROOF: Using (2.6), we can write

(2.11) \quad E\, L(\theta, I) = 1 + \int_{\theta=0}^{\infty} g_\tau(\theta)\, d\theta \int_{y=-\infty}^{\infty} [c(\delta_2(y) - \delta_1(y)) - \xi(\theta, y)]\, f(y \mid \theta)\, dy
= 1 + \int_{y=-\infty}^{\infty} h_\tau(y)\, dy \int_{\theta=0}^{\infty} [c - g_\tau(\theta \mid y)]\, \xi(\theta, y)\, d\theta,

where Fubini's theorem justifies changing the order of integration and the conditions for using the
theorem are verified as in Joshi [12, Lemma 5.1], and where

(2.12) \quad h_\tau(y) = (2/\pi\gamma^2\tau^2)^{1/2}\, \Phi(y/\sigma\gamma)\, e^{-y^2/2\gamma^2\tau^2}.

The expression in (2.11) will be minimized by minimizing the inner integral on the second line for
each y, namely by taking

(2.13) \quad \xi_\tau(\theta, y) = 1 \text{ if } g_\tau(\theta \mid y) \ge c; \quad \xi_\tau(\theta, y) = 0 \text{ otherwise.}

Since g_\tau(\theta \mid y) is unimodal and monotone on either side of the mode, it is seen that (2.13) and (2.7) are
equivalent. Condition (2.10) is needed to guarantee that g_\tau(\theta \mid y) \ge c for some \theta. This completes the
proof.
NOTE: Condition (2.10) assures that the cost due to length does not dominate the cost due to
noncoverage. Since any point estimate has zero length and zero coverage probability, the degeneracy
in the location arises. Because condition (2.10) depends on Y, it is possible for small Y values to lead to
nondegenerate intervals, while larger Y values lead to the degenerate situation. To avoid this, it seems
desirable to add a stronger condition assuring nondegenerate intervals for all observations. Noting that

(1/2\pi\sigma^2)^{1/2} \le (\gamma^2/2\pi\sigma^2)^{1/2} \le (\gamma^2/2\pi\sigma^2)^{1/2}\, [\Phi(y/\sigma\gamma)]^{-1},

we can eliminate the dependence on Y by using (\gamma^2/2\pi\sigma^2)^{1/2} as a bound for c. This still leaves the
uniqueness dependent on \tau (through \gamma), and when we take limits as \tau \to \infty, we will want c such that a
nondegenerate solution holds for all \tau. This suggests imposing the condition

(2.14) \quad c < (1/2\pi\sigma^2)^{1/2},

which we shall do henceforth.
Even under (2.10), it is possible to have a degenerate interval at zero because M + D may be
negative. The requirement that M + D > 0 for all Y reduces (for Y < 0) to

(2.15) \quad (c\sigma\sqrt{2\pi}/\gamma)\, \Phi(Y/\sigma\gamma) < e^{-Y^2/2\sigma^2\gamma^2}.
Since it is always true that \Phi(x) < e^{-x^2/2}\ (x \le 0), it is seen that the stronger condition (2.14) assures
M + D > 0, so that no point estimates will be incurred. Condition (2.10) alone does not insure (2.15).
We summarize the above remarks in Corollary 2.1.
COROLLARY 2.1: The Bayes estimation interval I^\tau for prior densities (2.3), loss (1.1), and (2.14)
is nondegenerate for all Y and \tau, with \delta_1(Y) given by (2.7), and

(2.16) \quad \delta_2(Y) = M(\gamma, Y) + D(\gamma, Y) > 0.
We shall now study the generalized Bayes rule obtained by taking a uniform distribution on
(0, \infty). It is obtainable from the previous results as indicated below. We denote it by I^p.

COROLLARY 2.2: Under the conditions of Corollary 2.1, the uniform generalized Bayes interval
estimate is obtained by letting \tau \to \infty, and is given by

(2.17) \quad \delta_1(Y) = \max[0, Y - D(Y)], \qquad \delta_2(Y) = Y + D(Y),

where

(2.18) \quad D(Y) = \{-2\sigma^2 \log [c\sigma\sqrt{2\pi}\, \Phi(Y/\sigma)]\}^{1/2}.
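As a numerical illustration (ours, not from the paper), the interval of Corollary 2.2 can be evaluated directly from (2.17)-(2.18) using only the standard normal CDF; the function and variable names are assumptions:

```python
import math

def phi_cdf(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bayes_interval(y, sigma, c):
    """Generalized Bayes interval (2.17)-(2.18) for a nonnegative normal mean.

    Requires condition (2.14), c < (1/(2*pi*sigma**2))**0.5, which makes the
    argument of the logarithm less than one, so D(Y) is real and positive.
    """
    if not c < 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2):
        raise ValueError("condition (2.14) violated")
    arg = c * sigma * math.sqrt(2.0 * math.pi) * phi_cdf(y / sigma)
    d = math.sqrt(-2.0 * sigma ** 2 * math.log(arg))
    return max(0.0, y - d), y + d
```

For Y well above zero the interval is essentially the usual symmetric one about Y, while for Y near or below zero the lower end point is truncated at the boundary of the parameter space, as (2.17) prescribes.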
Next, we study the risk function of the Bayes estimator.
LEMMA 2.2: The risk function of I^\tau is given for \theta \ge 0 by

(2.19) \quad R(\theta, I^\tau) = (c/\gamma^2)\big[\theta\,\Phi(\gamma u(\gamma) - (\theta/\sigma)) - \sigma\,\varphi(\gamma u(\gamma) - (\theta/\sigma))\big]
+ 2\sqrt{2}\,(c\sigma/\gamma) \int_{-\infty}^{\infty} \big[-\log c\sqrt{2\pi}\,(\sigma/\gamma)\,\Phi(\gamma^{-1}(y + (\theta/\sigma)))\big]^{1/2}\, \varphi(y)\, dy
- \sqrt{2}\,(c\sigma/\gamma) \int_{-\infty}^{\gamma u(\gamma) - (\theta/\sigma)} \big[-\log c\sqrt{2\pi}\,(\sigma/\gamma)\,\Phi(\gamma^{-1}(y + (\theta/\sigma)))\big]^{1/2}\, \varphi(y)\, dy
+ 1 - \{\Phi(\gamma u_2(\gamma, \theta) - (\theta/\sigma)) - \Phi(\gamma u_1(\gamma, \theta) - (\theta/\sigma))\},

where u_1(\gamma, \theta) and u_2(\gamma, \theta) are the two solutions of

(2.20) \quad \Phi(x) = (\gamma/c\sigma)\,\varphi(x - (\gamma\theta/\sigma))

and u(\gamma) = u_2(\gamma, 0) is the unique positive solution of (2.20) for \theta = 0.
PROOF: Since D(\gamma, Y) > 0 a.s., it is immediately obtained that

(Y/\gamma^2) \ge D(\gamma, Y) (i.e., \delta_1(Y) > 0 in (2.7)) if and only if \Phi(Y/\gamma\sigma) \ge (\gamma/c\sigma)\,\varphi(Y/\gamma\sigma), Y \ge 0.

Furthermore, \Phi(u) is a strictly increasing function of u while \varphi(u) is a strictly decreasing function
on [0, \infty). Hence,

(Y/\gamma^2) \ge D(\gamma, Y) \iff (Y/\gamma\sigma) \ge u(\gamma).

Therefore,
E_\theta\{\delta_2(Y) - \delta_1(Y)\} = 2E_\theta\{D(\gamma, Y)\, I_{[\gamma\sigma u(\gamma),\, \infty)}(Y)\} + E_\theta\{(M(\gamma, Y) + D(\gamma, Y))\, I_{(-\infty,\, \gamma\sigma u(\gamma))}(Y)\}.
It may be immediately verified that this yields the terms in (2.19) which are multiplied by c.
Write the noncoverage probability as P_\theta[\delta_2(Y) < \theta] + P_\theta[\delta_1(Y) > \theta \ge 0]. The first of these can be
written as

P\{M + D < \theta\} = P\{D^2 < (\theta - M)^2,\; \theta > M\} = P\{\Phi(Y/\sigma\gamma) > (\gamma/c\sigma)\,\varphi((Y/\sigma\gamma) - (\gamma\theta/\sigma)),\; Y < \gamma^2\theta\},

and similarly for the other probability. Combining these gives as the coverage probability

(2.21) \quad P_\theta\{\gamma\sigma u_1(\gamma, \theta) \le Y \le \gamma\sigma u_2(\gamma, \theta)\}.
Note that Eq. (2.20) does have two solutions when \theta > 0, since (2.14) assures [\varphi(x - (\gamma\theta/\sigma))/\Phi(x)]
> (c\sigma/\gamma) at x = (\gamma\theta/\sigma), and \lim_{x \to -\infty} [\varphi(x - (\gamma\theta/\sigma))/\Phi(x)] = \lim_{x \to \infty} [\varphi(x - (\gamma\theta/\sigma))/\Phi(x)] = 0. The
probability statement in (2.19) follows from (2.21) and straightforward manipulation.
NOTE: The risk function of I^p is obtained by setting \gamma = 1 in (2.19).
We shall now examine the difference between the Bayes risks of I^\tau and I^p. Let

(2.22) \quad r(\tau, I) = \int R(\theta, I)\, g_\tau(\theta)\, d\theta

denote the Bayes risk of an estimation interval I. In order to bound this difference, we must first look
at the functions u_i(\gamma, \theta), i = 1, 2, defined by (2.20).
LEMMA 2.3: The solutions u_i(\gamma, \theta) of (2.20) satisfy for all \theta,

(2.23) \quad |u_i(\gamma, \theta) - (\gamma\theta/\sigma)| \ge [2 \log (\gamma/c\sigma\sqrt{2\pi})]^{1/2}, \quad i = 1, 2.

Also |u_i(\gamma, \theta) - (\gamma\theta/\sigma)| is monotone decreasing as \theta increases, with a limit of [2 \log (\gamma/c\sigma\sqrt{2\pi})]^{1/2}.
The functions u_i(\gamma, \theta) are monotone increasing in \theta\ (i = 1, 2). For all \theta sufficiently large (\theta > \theta^*, say)

(2.24) \quad |u_i(\gamma, \theta) - (\gamma\theta/\sigma)| \le [2 \log (\gamma/c\sigma\sqrt{2\pi})]^{1/2} + [(2c\sigma/\gamma)\, e^{-\frac{1}{2}(\gamma\theta/\sigma)^2}]^{1/2}.

Finally, the functions u(\gamma) = u_2(\gamma, 0) satisfy for all \tau \ge 1,

(2.25) \quad 0 < u(\gamma) - u(1) < (\sigma/2c)(1 + (\sigma^2/2))(1/\tau^2).
The proof of this and the following lemma will be given in the Appendix.
LEMMA 2.4: The difference in Bayes risks between I^p and I^\tau is O(\tau^{-2}), i.e., for some B

(2.26) \quad \tau^2 [r(\tau, I^p) - r(\tau, I^\tau)] < B, \quad \text{for all } \tau.
Not only is the left side of (2.26) bounded, but it also has a limit as \tau \to \infty, which we state below.

COROLLARY 2.3: The limit as \tau \to \infty of the left side of (2.26) exists and is given by
(2.27) \quad \lim_{\tau \to \infty} \tau^2 [r(\tau, I^p) - r(\tau, I^\tau)] = (c\sigma/4)(1 + \sqrt{2})\, [-\log c\sigma\sqrt{2\pi}]^{1/2} + (1/4\sigma)\, [-\log c\sigma\sqrt{2\pi}]^{-1/2}.
The proof of this Corollary is given in the Appendix.
Next, we establish our main result,
THEOREM 2.1: Let I^p be the interval estimator (2.17), and I' any other estimator such that

(2.28) \quad R(\theta, I') \le R(\theta, I^p) \text{ for all } \theta \ge 0.

Then I' is equivalent to I^p, i.e.,

\xi'(x, \theta) = \xi^p(x, \theta) \text{ for almost all } (x, \theta) \in R \times (0, \infty),

where \xi(x, \theta) is defined by (2.6).
NOTE: With respect to the loss function (1.1), Theorem 2.1 establishes the admissibility of I^p
among all confidence set estimators up to an equivalence class of procedures. Without the qualification
of "up to an equivalence class," or some restriction on the class of confidence set procedures, no
interval estimator can be admissible. This is due to the discontinuous nature of the loss function (1.1):
any estimator I can be modified on a set of two-dimensional Lebesgue measure zero to improve its
coverage probability on a set of \theta values having measure zero. (Joshi [12] discusses this in some
detail.) Theorem 2.1 shows that if (2.28) holds, strict inequality can hold only on a set of \theta values
having Lebesgue measure zero, i.e., "almost" (or "weak") admissibility, but the conclusion of
equivalence of the estimators is a stronger one than the conclusion only of "almost" admissibility.
If the class of procedures is restricted to those which yield closed intervals for every observation,
then Theorem 2.1 establishes admissibility without any qualifications regarding equivalence classes.
Further, Theorem 2.1 establishes the "strong admissibility" of I^p according to the definition used
by Joshi [12, Section 7].
PROOF OF THEOREM 2.1: We follow the line of Joshi's [12] argument, making appropriate
modifications.
Let

(2.29) \quad U^{(i)}(y) = 1 + \int_0^\infty [c - g_\infty(\theta \mid y)]\, \xi^{(i)}(y, \theta)\, d\theta = E\{L(\theta, I^{(i)}) \mid Y = y\} \quad (i = 1, p),

where

(2.30) \quad g_\infty(\theta \mid y) = \{\Phi(y/\sigma)\}^{-1} \{2\pi\sigma^2\}^{-1/2}\, e^{-(1/2\sigma^2)(\theta - y)^2}, \quad \theta \ge 0.

Clearly U^{(i)}(y) is minimized when

(2.31) \quad \xi^{(i)}(y, \theta) = 1 \text{ for } g_\infty(\theta \mid y) \ge c; \quad \xi^{(i)}(y, \theta) = 0 \text{ for } g_\infty(\theta \mid y) < c,

i.e., for I^{(p)}. Thus, either

(I) U^{(1)}(y) = U^{(p)}(y) for almost all y in R,

or

(II) U^{(1)}(y) > U^{(p)}(y) for a subset S of R with positive measure.
INTERVAL ESTIMATION OF NORMAL MEANS 491
If (I) holds, then the desired equivalence can be established as on page 1065 of Joshi [12]. We show that (II) leads to a contradiction.
Note that (II) implies that there exists a positive number $k$ such that for some $a$
(2.32) $\int_{-a}^{a} \Phi(y/\sigma)\,[U^{(1)}(y) - U^{(p)}(y)]\,dy = k.$
We now show, as in Joshi [12], that $k = 0$, achieving the desired contradiction. Using (2.11) we see that
(2.33) $E\{L(\theta, I^{(1)}) - L(\theta, I^{(p)})\} = (2/\pi\tau^2\gamma^2)^{1/2} \int_{-\infty}^{\infty} G_\tau(y)\,dy,$
where
(2.34) $G_\tau(y) = e^{-y^2/2\tau^2\gamma^2}\,\Phi(y/\sigma\gamma)\Big\{\Big[c\big(\delta_2^{(1)}(y) - \delta_1^{(1)}(y)\big) - \int_0^\infty g_\tau(\theta\,|\,y)\,\xi^{(1)}(y, \theta)\,d\theta\Big] - \Big[c\big(\delta_2^{(p)}(y) - \delta_1^{(p)}(y)\big) - \int_0^\infty g_\tau(\theta\,|\,y)\,\xi^{(p)}(y, \theta)\,d\theta\Big]\Big\},$
and $g_\tau(\theta\,|\,y)$ is given by (2.4). The limit as $\tau \to \infty$ of $g_\tau(\theta\,|\,y)$ is $g_1(\theta\,|\,y)$ of (2.30), and the Helly–Bray lemma justifies taking the limit inside the integrals of (2.34), so that by (2.29) we find
(2.35) $\lim_{\tau\to\infty} G_\tau(y) = \Phi(y/\sigma)\,[U^{(1)}(y) - U^{(p)}(y)].$
Next we justify the interchange of limit and integration,
(2.36) $\lim_{\tau\to\infty} \int_{-a}^{a} G_\tau(y)\,dy = \int_{-a}^{a} \lim_{\tau\to\infty} G_\tau(y)\,dy = k,$
by showing that $|G_\tau(y)| < G(y)$; in particular,
(2.37) $|G_\tau(y)| < G(y) = c\big\{\big(\delta_2^{(1)}(y) - \delta_1^{(1)}(y)\big) + \big(\delta_2^{(p)}(y) - \delta_1^{(p)}(y)\big)\big\} + 2.$
By definition (2.17), $[\delta_2^{(p)}(y) - \delta_1^{(p)}(y)]$ is bounded uniformly. Now consider
(2.38) $c\int_{-a}^{a} [\delta_2^{(1)}(y) - \delta_1^{(1)}(y)]\,dy = c\int_{-a}^{a} [\delta_2^{(1)}(y) - \delta_1^{(1)}(y)]\,\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-y^2/2\sigma^2}\cdot(\sqrt{2\pi}\,\sigma)\,e^{y^2/2\sigma^2}\,dy$
$\le \sigma\sqrt{2\pi}\,e^{a^2/2\sigma^2}\,E_{\theta=0}\big\{c\big(\delta_2^{(1)}(Y) - \delta_1^{(1)}(Y)\big)\big\} \le \sigma\sqrt{2\pi}\,e^{a^2/2\sigma^2}\,R(0, I^{(1)}) \le \sigma\sqrt{2\pi}\,e^{a^2/2\sigma^2}\,R(0, I^{(p)}),$
the last inequality from (2.28). Thus $G(y)$ is integrable over $(-a, a)$, and from (2.36) we conclude that, given $\epsilon > 0$, there is a $\tau_0$ such that for all $\tau > \tau_0$
(2.39) $\int_{-a}^{a} G_\tau(y)\,dy > k - \epsilon.$
Now we observe that, from the construction of a Bayes procedure,
(2.40) $G_\tau(y) \ge e^{-y^2/2\tau^2\gamma^2}\,\Phi(y/\sigma\gamma)\Big\{\Big[c\big(\delta_2^{\tau}(y) - \delta_1^{\tau}(y)\big) - \int_0^\infty g_\tau(\theta\,|\,y)\,\xi^{\tau}(y, \theta)\,d\theta\Big] - \Big[c\big(\delta_2^{(p)}(y) - \delta_1^{(p)}(y)\big) - \int_0^\infty g_\tau(\theta\,|\,y)\,\xi^{(p)}(y, \theta)\,d\theta\Big]\Big\} = D_\tau(y) \le 0.$
Thus,
(2.41) $(2/\pi\tau^2\gamma^2)^{1/2} \int_{|y|>a} G_\tau(y)\,dy \ge (2/\pi\tau^2\gamma^2)^{1/2}\Big(\int_{-\infty}^{-a} + \int_a^{\infty}\Big) D_\tau(y)\,dy \ge (2/\pi\tau^2\gamma^2)^{1/2} \int_{-\infty}^{\infty} D_\tau(y)\,dy = r(\tau, I_\tau) - r(\tau, I_p).$
By using (2.34), (2.39), and (2.41), and assuming $\tau_0 > 1$,
(2.42) $r(\tau, I^{(1)}) - r(\tau, I_p) \ge (2/\pi\tau^2\gamma^2)^{1/2}\,(k - \epsilon) - \big[r(\tau, I_p) - r(\tau, I_\tau)\big].$
By (2.28), $r(\tau, I^{(1)}) - r(\tau, I_p) \le 0$, so that
(2.43) $k \le (\pi\tau^2\gamma^2/2)^{1/2}\,\big[r(\tau, I_p) - r(\tau, I_\tau)\big] + \epsilon.$
From Lemma 2.3 and the arbitrary nature of $\epsilon$, it follows that $k = 0$, and the proof of the Theorem is complete.
3. RISK OF I_p AND MINIMAX ESTIMATION
We shall now investigate the behavior of the uniform generalized Bayes interval as a function of the observed mean $Y$, and the risk function of $I_p$. We shall introduce a "truncated" estimate and show that it is minimax. It has not been possible to demonstrate the (conjectured) minimaxity of $I_p$.
LEMMA 3.1: The length of the interval $I_p$ is a unimodal function of $Y$, with maximum
(3.1) $2\big\{-2\sigma^2 \log\big[c\sigma\sqrt{2\pi}\,\Phi(u(1))\big]\big\}^{1/2}$
at $Y = \sigma u(1)$, where $u(1)\,(>0)$ is given by (2.20) with $\gamma = 1$, $\theta = 0$.
The length has a limit of
(3.2) $2\big\{-2\sigma^2 \log\big[c\sigma\sqrt{2\pi}\big]\big\}^{1/2}$
as $Y$ increases, and behaves like
(3.3) $(\sigma^2/|Y|)\,\log|Y|$
as $Y$ decreases to $-\infty$. Both endpoints $\delta_1(Y)$, $\delta_2(Y)$ are monotone increasing functions of $Y$, and $\delta_2(Y) > 0$ for all $Y$.
PROOF: From the characterization (2.30) and (2.31), we see that the endpoints $\delta_i(y)$ $(i = 1, 2)$ of $I_p$ satisfy the equation
(3.4) $\{2\pi\sigma^2\}^{-1/2}\,e^{-\frac{1}{2\sigma^2}(\delta - y)^2} = c\,\Phi(y/\sigma),$
so that $(\delta - y)$ decreases monotonely as $y$ increases. The length is $2(\delta_2 - y)$ if $y \ge \sigma u(1)$, where $u(1)$ is defined by (2.20), and the length is just $\delta_2$ (the larger root) if $y < \sigma u(1)$. The assertion that $\delta_2$ is monotone increasing with $y$ is equivalent to the assertion that the equation
(3.5) $\Phi(x) - (c\sigma)^{-1}\,\varphi(a - x) = 0, \qquad a > 0,$
has only one solution $x$ with $x < a$. The latter statement follows by noting that (3.5) is positive and increasing for $x < a - y_2$ (say), is decreasing for $a - y_2 < x < a - y_1$ (say), then is increasing for $a - y_1 < x$, and is negative at $x = a$.
The monotonicity of $\delta_1$ is easily verified by implicit differentiation of (3.4). Putting $\Phi = 1$ in (3.4) gives (3.2), and using the tail approximation for $(1 - \Phi(x))$ when $x$ is large (Feller [10, p. 166]) gives (3.3). This completes the proof.
Next we look at some aspects of the behavior of the risk function $R(\theta, I_p)$. Let $I^0$ denote the interval with endpoints $Y \pm \{-2\sigma^2 \log[c\sigma\sqrt{2\pi}]\}^{1/2}$, which is the Pitman-type estimation interval for unrestricted $\theta$ and was shown by Joshi [12] to be admissible and minimax with constant risk
(3.6) $R = R(\theta, I^0) = 2\Big[c\big\{-2\sigma^2\log[c\sigma\sqrt{2\pi}]\big\}^{1/2} + \Phi\big(-\big\{-2\log[c\sigma\sqrt{2\pi}]\big\}^{1/2}\big)\Big].$
Comparing the risks of the two estimators, we obtain:
LEMMA 3.2: The risk function $R(\theta, I_p)$ is discontinuous at $\theta = 0$, with
(3.7) $R(0-, I_p) = 1 + c\Big\{2\sqrt2\,\sigma\int_{-\infty}^{\infty}\big[-\log c\sigma\sqrt{2\pi}\,\Phi(y)\big]^{1/2}\varphi(y)\,dy - \sqrt2\,\sigma\int_{-\infty}^{u(1)}\big[-\log c\sigma\sqrt{2\pi}\,\Phi(y)\big]^{1/2}\varphi(y)\,dy - \sigma\,\varphi(u(1))\Big\}$
$\qquad = R + \Phi(u(1))\big[1 - (c\sigma)\big(c\sigma + u(1)\big)\big] > 1 > R,$
(3.8) $R(0, I_p) = R(0+, I_p) = R(0-, I_p) - \Phi(u(1)) = R - (c\sigma)\,\Phi(u(1))\,\big[c\sigma + u(1)\big] < R,$
and
(3.9) $\lim_{\theta\to+\infty} R(\theta, I_p) = R.$
PROOF: The expected length of $I_p$ is a continuous function of $\theta$, but the coverage probability drops to 0 for $\theta < 0$. This accounts for the first part of (3.8) and for (3.7), which represents $R(0-, I_p)$ as unity plus $c$ times the expected length at zero. The other equality of (3.7) comes from using the change of variables $v = \{-2\log c\sigma\sqrt{2\pi}\,\Phi(y)\}^{1/2}$ to evaluate the integrals. The first inequality comes from the positive character of the expected length. The second comes from the fact that any point estimator has constant risk of unity, and $I^0$, being admissible, must have smaller risk. By taking limits inside the integrals in (2.19), and using Lemma 2.3, (3.9) is verified, completing the proof.
NOTE: The argument used to establish the first inequality in (3.7) can be extended easily to show
(3.10) $R(\theta, I_p) > 1 > R$ for all $\theta < 0$.
The preceding lemma suggests very strongly that the risk function of $I_p$ is $\le R$ for all $\theta \ge 0$.
However, we have been unsuccessful in attempting to prove this boundedness. The motivation for
desiring it is the fact that any estimator whose risk is so bounded is minimax. This minimax property
can be established by essentially the same proof as is used in Blumenthal and Cohen [7, Theorem 3.1].
One estimator which is minimax is the obvious "truncated" estimator $I_T$ based on $I^0$, namely
(3.11) $\delta_1^T(Y) = \max[0,\, Y - D]; \qquad \delta_2^T(Y) = \max[0,\, Y + D],$
where
(3.12) $D = \big\{-2\sigma^2\log[c\sigma\sqrt{2\pi}]\big\}^{1/2},$
with the interval becoming a point for $Y < -D$. Clearly, for $\theta > 0$, $\theta \in I^0 \iff \theta \in I_T$, so that both intervals have the same coverage probability, and $I_T$ is never longer than $I^0$. Thus $R(\theta, I_T) \le R$ for all $\theta$, and $I_T$ is minimax. The risk function of $I_T$ is given by
(3.13) $R(\theta, I_T) = c\Big\{D\big[1 - \Phi\big(\tfrac{D-\theta}{\sigma}\big) + \Phi\big(\tfrac{D+\theta}{\sigma}\big)\big] + \sigma\big[\varphi\big(\tfrac{D+\theta}{\sigma}\big) - \varphi\big(\tfrac{D-\theta}{\sigma}\big)\big] + \theta\big[\Phi\big(\tfrac{D+\theta}{\sigma}\big) + \Phi\big(\tfrac{D-\theta}{\sigma}\big) - 1\big]\Big\} + 1 - \big[\Phi(D/\sigma) - (1 - h(\theta))\,\Phi(-D/\sigma)\big]$
$\qquad = R - c\Big\{(D+\theta)\big[1 - \Phi\big(\tfrac{D+\theta}{\sigma}\big)\big] + (D-\theta)\,\Phi\big(\tfrac{D-\theta}{\sigma}\big) + \sigma\big[\varphi\big(\tfrac{D-\theta}{\sigma}\big) - \varphi\big(\tfrac{D+\theta}{\sigma}\big)\big]\Big\} - h(\theta)\,\Phi(-D/\sigma),$
where
$h(\theta) = 1$ if $\theta = 0$; $\;= 0$ otherwise.
From (3.13), we see that $\lim_{\theta\to\infty} R(\theta, I_T) = R$.
Although $I_T$ is minimax, it is not likely that it is admissible, since its length is not a continuous function of $Y$. Clearly, generalized Bayes intervals will have continuous lengths, so that $I_T$ would not be in the generalized Bayes class. If this class is complete, as is the case for point estimation (see Sacks [17]), then $I_T$ could not be admissible.
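The two properties used in this argument, identical coverage for $\theta > 0$ and no greater length, can be verified mechanically. The sketch below assumes $\sigma = 1$ and $D = 1.96$ purely for illustration:

```python
import math

D = 1.96  # (3.12) with sigma = 1 and c*sigma*sqrt(2*pi) = exp(-1.96^2/2)

def I0(y):
    """Pitman-type interval for unrestricted theta."""
    return y - D, y + D

def IT(y):
    """Truncated interval (3.11); the single point {0} when y < -D."""
    return max(0.0, y - D), max(0.0, y + D)

def covers(interval, theta):
    lo, hi = interval
    return lo <= theta <= hi

for ky in range(-500, 501):
    y = ky / 50.0
    lo0, hi0 = I0(y)
    loT, hiT = IT(y)
    # I_T is never longer than I^0
    assert hiT - loT <= hi0 - lo0 + 1e-12
    for kt in range(1, 100):
        theta = kt / 10.0  # theta > 0 only
        assert covers(I0(y), theta) == covers(IT(y), theta)
print("ok")
```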
We close this section by taking a brief look at the coverage probability of $I_p$ as a function of $\theta$.
LEMMA 3.3: The coverage probability of $I_p$, given by
(3.14) $P(\theta) = \Phi\big(u_2(1, \theta) - (\theta/\sigma)\big) - \Phi\big(u_1(1, \theta) - (\theta/\sigma)\big),$
is monotone decreasing in $\theta$, with maximum (for $\theta \ge 0$) of $\Phi(u(1))$ and minimum of $(\Phi(L) - \Phi(-L))$, where the $u$'s are given by (2.20) and
(3.15) $L = \big\{-2\log[c\sigma\sqrt{2\pi}]\big\}^{1/2}.$
PROOF: The proof follows directly from using Lemma 2.3 in (3.14), and (3.14) itself comes directly from Lemma 2.2.
To compare with the case of unrestricted $\theta$, let the coverage probability in that case be $(1 - \alpha)$. Then $L$ satisfies $\Phi(L) - \Phi(-L) = (1 - \alpha)$, and from (3.15), $c\sigma = \varphi(L)$. Thus, from (2.20), $u(1)$ satisfies
(3.16) $\varphi(u(1))\big/\Phi(u(1)) = \varphi(L),$
and $P(\theta)$ decreases from $\Phi(u(1))$ to $(1 - \alpha)$ as $\theta$ increases. From (3.16) we see that $u(1) > L$, and if $\alpha$ is small, so that $L$ is large, $\Phi(u) \approx 1$ and $u(1) \approx L$, giving $\max P(\theta) = P(0) \approx \Phi(L) = (1 - (\alpha/2))$. (In fact, $P(0) > (1 - (\alpha/2))$.) Also, from above we see that $\varphi(u) = \Phi(u)\,\varphi(L) > \Phi(L)\,\varphi(L)$, so that
$u(1) < u^*$, where $\varphi(u^*) = \Phi(L)\,\varphi(L) = (1 - (\alpha/2))\,\varphi(L)$, and $P(0) < \Phi(u^*)$. Numerically, we consider two examples, $\alpha = 0.05$ and $0.10$.
α      L      1 − (α/2)     P(0) = Φ(u(1))     Φ(u*)
0.05   1.96   0.975         0.9755             0.976
0.10   1.64   0.95          0.952              0.953
It seems evident that for the usual ranges of $\alpha$ values, $(1 - (\alpha/2))$ is quite sufficient as an approximation to $P(0)$. We might also note that the above method of bounding $P(0)$ suggests an iterative procedure for solving (3.16), namely, define $u_{i+1}$ by
$\varphi(u_{i+1}) = \Phi(u_i)\,\varphi(L), \qquad i = 0, 1, \ldots \quad (u_0 = L).$
This gives a sequence whose even- (odd-) numbered terms are less (greater) than the desired solution, and whose even (odd) terms are monotone increasing (decreasing), so that the sequence converges to $u(1)$.
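The iteration is easily run; in the sketch below (with the assumed value $L = 1.96$, i.e., $\alpha = 0.05$) each step solves $\varphi(u_{i+1}) = \Phi(u_i)\,\varphi(L)$ for the positive root:

```python
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

L = 1.96
u = L  # u_0 = L
for i in range(20):
    # phi(u_{i+1}) = Phi(u_i) * phi(L), solved for the positive root
    u = math.sqrt(-2.0 * math.log(math.sqrt(2.0 * math.pi) * Phi(u) * phi(L)))

# the fixed point satisfies (3.16): phi(u)/Phi(u) = phi(L), with u(1) > L
print(u, Phi(u))
```

Twenty iterations are far more than needed; the map contracts with slope roughly $\varphi(L)/u(1) \approx 0.03$, and the fixed point reproduces the table entry $\Phi(u(1)) \approx 0.9755$.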
4. THE POSTERIOR BAYES RISK AND SEQUENTIAL PROCEDURES
In this section, we shall consider the posterior expected loss given the observations for the Bayes procedures of Section 2. We shall study a sequential rule which is in effect a curtailment of the fixed-sample rule and which guarantees a uniform bound on $R(\theta, I)$. Also, for linear sampling cost, a sequential Bayes rule which is Asymptotically Pointwise Optimal (A.P.O.) in the sense of Bickel and Yahav [4-6], and asymptotically optimal, will be given.
The posterior expected loss for the Bayes procedure is given by the minimum value of the inner integral of (2.11), namely
(4.1) $r(Y) = r^+(Y) = 1 + 2cD(\gamma, Y) - \big[\Phi\big(M(\gamma, Y)/\zeta\big)\big]^{-1}\big[2\Phi\big(D(\gamma, Y)/\zeta\big) - 1\big]$ if $Y \ge \sigma\gamma u(\gamma)$,
$\quad\;\; r(Y) = r^-(Y) = 1 + c\big[M(\gamma, Y) + D(\gamma, Y)\big] - \big[\Phi\big(M(\gamma, Y)/\zeta\big)\big]^{-1}\big[\Phi\big(D(\gamma, Y)/\zeta\big) - \Phi\big(-M(\gamma, Y)/\zeta\big)\big]$ if $Y < \sigma\gamma u(\gamma)$,
where
(4.2) $\zeta^2 = \sigma^2/\gamma^2$
and $u(\gamma)$ is defined by (2.20), $\gamma$ by (2.5), $M(\gamma, Y)$ by (2.8), and $D(\gamma, Y)$ by (2.9). We shall write simply $M$ and $D$ where this causes no confusion.
By straightforward differentiation, we find that $r^+(Y)$ is a monotone increasing function of $Y$ with a maximum of
(4.3) $2\Big[c\zeta\big\{-2\log c\zeta\sqrt{2\pi}\big\}^{1/2} + \Phi\big(-\big\{-2\log c\zeta\sqrt{2\pi}\big\}^{1/2}\big)\Big].$
The square-bracketed term in (4.3) goes monotonely to zero as $\zeta \to 0$ (i.e., as the number of observations $n$ increases). The function $r^-(Y)$ is also monotone increasing in $Y$, with minimum of 0 at $Y = -\infty$. Thus the posterior risk $r(Y)$ is monotone in $Y$ with maximum (4.3).
If a Bayes confidence interval is sought in the case of unrestricted $\theta$ $(-\infty < \theta < \infty)$, relative to the prior density (2.3) also on the whole line, with loss (1.1), the resulting estimate $M(\gamma, Y) \pm D(\gamma, \infty)$ has constant posterior expected loss (4.3). Thus if a sample size $n_0$ is chosen so that this interval estimate will have Bayes risk $r_0$, the same risk can be achieved in the case of restricted $\theta$ by a sequential rule which stops as soon as $r(Y) < r_0$, and this sequential rule will never take more than $n_0$ observations.
We shall now consider the stochastic limit as n increases of the random variable r(Y). From this,
it will be possible to construct A.P.O. sequential stopping rules.
LEMMA 4.1: Let the random variable $r(Y)$ be the posterior Bayes risk for the interval $I_\tau$. Then
(4.4a) $\lim_{n\to\infty}\,(n/\log n)^{1/2}\,r(Y) = 0$ if $\theta < 0$, w.p.1,
(4.4b) $\lim_{n\to\infty}\,(n/\log n)^{1/2}\,r(Y) = 2c\nu$ if $\theta > 0$, w.p.1,
(4.4c) $\lim_{n\to\infty}\,(n/\log n)^{1/2}\,r(Y) = c\nu$ if $\theta = 0$, w.p.1.
PROOF: From (4.1), (4.2), the definitions (2.1), (2.5), (2.8), (2.9), and the tail approximation to the normal, we find
(4.5a) $\lim_{n\to\infty}\,(n/\log n)^{1/2}\,r^+(x) = 2c\nu$, $\quad x \ge \nu\big((\log n)/n\big)^{1/2}$,
(4.5b) $\lim_{n\to\infty}\,(n/\log n)^{1/2}\,r^-(x) = 0$, $\quad x < 0$, $x$ fixed,
(4.5c) $\lim_{n\to\infty}\,(n/\log n)^{1/2}\,r^-(x) = c\nu\big[(a^2 + 1)^{1/2} - a\big]$, $\quad x = -a\nu\big((\log n)/n\big)^{1/2}$, $a > 0$,
(4.5d) $\lim_{n\to\infty}\,(n/\log n)^{1/2}\,r^-(x) = c\nu(a + 1)$, $\quad x = a\nu\big((\log n)/n\big)^{1/2}$, $a > 0$.
From the definition (2.20), we find
(4.6) $\sigma\gamma u(\gamma) = \nu\sqrt{(\log n)/n} + o\big(\sqrt{(\log n)/n}\big)$ as $n$ increases.
Using the fact that $Y \to \theta$ w.p.1 as $n$ increases, the fact that $r(Y) = r^+(Y)$ if $Y > \sigma\gamma u(\gamma)$ $(= r^-(Y)$ otherwise$)$, along with (4.5a), (4.5b), and (4.6), we see that w.p.1 $r(Y) = r^+(Y)$ if $\theta > 0$ $(= r^-(Y)$ if $\theta < 0)$, and that (4.4a) and (4.4b) hold.
For $\theta = 0$, we use the law of the iterated logarithm to conclude that w.p.1, $|Y| < 2\sqrt{(2\log\log n)/n}$, so that w.p.1, $(n/\log n)^{1/2}\,r(Y)$ is given by either (4.5c) if $Y < 0$ or by (4.5d) if $Y > 0$, with $a = 0$. Both expressions agree in that case, giving (4.4c) and completing the proof.
It is immediately evident from the proof above that
(4.7) $\sup_n\,(n/\log n)^{1/2}\,E\big(r(Y)\big) < \infty,$
where the expectation is with respect to the marginal density $f_\tau(y)$.
We now consider the construction of sequential Bayes stopping rules where the cost per observation is a constant, $C_0$, with stopping cost at stage $n$ defined as
(4.8) $C_0 n + L(\theta, I),$
with $L(\theta, I)$ given by (1.1). Proceeding as in Bickel and Yahav (1968), we define the stopping rule: stop for the first $n$ such that
(4.9) $r(Y)\,\Big[1 - \Big(\big(n\log(n+1)\big)\big/\big((n+1)\log n\big)\Big)^{1/2}\Big] \le C_0,$
which is obtained by writing $r(Y)$ as $((\log n)/n)^{1/2}\,V$ and finding the $n$ which minimizes $(nC_0 + r(Y))$. From Lemma 4.1 and (4.7), it is clear that the conditions of Theorems 2.1 and 3.1 of Bickel and Yahav [6] hold if $(n/\log n)^{1/2}$ is read in place of $n^{\beta}$. This substitution causes no difficulties in the proofs of these theorems. (The requirement that $V > 0$ is met, since none of our distributions puts any mass on $\theta < 0$.) Expanding the expression in (4.9), we obtain the approximate rule:
(4.10) stop as soon as $\big(r(Y)/2n\big) \le C_0$.
Summarizing the conclusions indicated above, we have
THEOREM 4.1: For the loss (4.8), the stopping rules (4.9) and (4.10) are A.P.O. and asymptotically
optimal in the sense of Bickel and Yahav [6].
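Rule (4.10) is simple enough to simulate. In the sketch below, the posterior risk $r(Y)$ is replaced by the deterministic stand-in $V\,((\log n)/n)^{1/2}$ suggested by Lemma 4.1; the stand-in, the constant $V = 1$, and the cost values are assumptions for illustration only, not the exact $r(Y)$ of (4.1):

```python
import math

def stop_time(c0, v=1.0):
    """Run rule (4.10): stop as soon as r_n/(2n) <= c0, with the
    stand-in posterior risk r_n = v*sqrt(log(n)/n)."""
    n = 2
    while True:
        r_n = v * math.sqrt(math.log(n) / n)
        if r_n / (2 * n) <= c0:
            return n
        n += 1

# smaller per-observation cost -> later stopping
times = [stop_time(c0) for c0 in (1e-2, 1e-3, 1e-4)]
print(times)
```

As the derivation of (4.10) predicts, the stopping time grows as the sampling cost $C_0$ shrinks.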
5. COST DUE TO LENGTH WEIGHTED BY A FUNCTION OF THE MEAN
In this section, we take a brief look at the intervals which result from using a loss function of the form
(5.1) $L(\theta, I) = h(\theta)\,\big(\delta_2(Y) - \delta_1(Y)\big) + \begin{cases} 0 & \text{if } \delta_1(Y) \le \theta \le \delta_2(Y), \\ 1 & \text{otherwise}, \end{cases}$
where the relative importance of the length of the interval depends on the size of the unknown mean being estimated, and the constant $c$ of (1.1) is absorbed in $h(\theta)$. Proceeding as in Lemma 2.1, we find the Bayes solutions to have the form
(5.2) $\xi(\theta, y) = 1$ if $g_\tau(\theta\,|\,y) \ge E[h(\theta)\,|\,y]$; $\;= 0$ otherwise,
where
(5.3) $E[h(\theta)\,|\,y] = \int_0^\infty h(\theta)\,g_\tau(\theta\,|\,y)\,d\theta,$
and $g_\tau(\theta\,|\,y)$ is given by (2.4). This leads to intervals of the form
(5.4) $\delta_1(Y) = \max\big[0,\; M(\gamma, Y) - G(\gamma, Y)\big], \qquad \delta_2(Y) = \max\big[0,\; M(\gamma, Y) + G(\gamma, Y)\big],$
where $M(\gamma, Y)$ is given by (2.8), and
(5.5) $G(\gamma, Y) = \Big\{(2\sigma^2/\gamma^2)\,\log\Big[\big(\gamma^2/2\pi\sigma^2\big)^{1/2}\Big/E\big(h(\theta)\,|\,Y\big)\,\Phi(Y/\sigma\gamma)\Big]\Big\}^{1/2}.$
For a solution of (5.2) to exist, some restriction on $h(\theta)$ is needed, such as
$\sup_{0\le\theta<\infty} h(\theta) \le (2\pi\sigma^2)^{-1/2},$
which is quite severe, but which also assures that $\delta_2(Y) > 0$ for all $Y$. As we vary $h(\theta)$, we can obtain quite different properties for the resulting interval, the effect of $h(\theta)$ entering through the half-length $G(\gamma, Y)$.
The requirement that the $\delta_i$ be monotone can be written as
(5.6) $G(\gamma, Y) - M(\gamma, Y) + \big[B\big(Y, \theta h(\theta)\big)\big/B\big(Y, h(\theta)\big)\big] \ge 0$ for $\delta_1$,
$\quad\;\; G(\gamma, Y) + M(\gamma, Y) - \big[B\big(Y, \theta h(\theta)\big)\big/B\big(Y, h(\theta)\big)\big] \ge 0$ for $\delta_2$,
where, for any function $r(\theta)$,
(5.7) $B\big(Y, r(\theta)\big) = \int_0^\infty r(\theta)\,e^{-\frac{\gamma^2}{2\sigma^2}\left(\theta - M(\gamma, Y)\right)^2}\,d\theta.$
If it is desired to choose $h(\theta)$ so that the length of the interval behaves for large $M$ approximately like a given function $f(M)$, then $h(\theta)$ must at least approximately satisfy
(5.8) $B\big(Y, h(\theta)\big) = e^{-(\gamma^2/8\sigma^2)\,f^2(M)}.$
Note that the condition $\delta_2(Y) > 0$ can be written
(5.9) $B\big(Y, h(\theta)\big) < e^{-(\gamma^2/2\sigma^2)\,M^2} \quad \text{for } Y < 0.$
For $f(M) = \lambda M$ ($\lambda$ any constant), (5.8) and (5.9) may conflict. From (5.7) we expect, as a first approximation, that for large $M$, $B\big(Y, h(\theta)\big) \approx h(M)$, so that in view of (5.8) we would try
(5.10) $h(\theta) \approx e^{-A\theta^2}$
for some $A > 0$, in order to have length approximately $f(M)$ for large $M$. When $M$ is negative, the length is not $2G(\gamma, Y)$ but $M + G$, and will not necessarily be of the form $f(M)$. In fact, as $M \to -\infty$ we generally want $(M + G) \to 0$, regardless of the form of the length for larger $M$.
Paulson [16] suggests a rationale for desiring a half-length of order $\lambda M$ $(\lambda < 1)$ for large $M$, approaching 0 for small $M$, in a problem where $\theta$ represents the mean of a difference between two normal variables. We shall construct here an interval having this property by choosing $h(\theta) = ce^{-A\theta^2}$. The resulting $B\big(Y, h(\theta)\big)$ is
(5.11) $c\,\big(2\pi\zeta^2/(1 + 2A\zeta^2)\big)^{1/2}\,e^{-AM^2/(1 + 2A\zeta^2)}\,\Phi\big(M\big/\zeta\sqrt{1 + 2A\zeta^2}\big),$
where
(5.12) $\zeta^2 = \sigma^2/\gamma^2.$
Since
(5.13) $G(\gamma, Y) = \big\{-2\zeta^2\,\log B\big(Y, h(\theta)\big)\big\}^{1/2},$
we see that for large $M$ the interval length is, to a first-order approximation,
(5.14) $2\big(2A\zeta^2/(1 + 2A\zeta^2)\big)^{1/2}\,M.$
Using the tail approximation for $\Phi(-x)$ as $x$ increases, we find that as $Y \to -\infty$,
(5.15) $\delta_2(Y) = \zeta^2\big(\log|M(\gamma, Y)|\big)\big/|M(\gamma, Y)| + O(1/|Y|),$
which is the same as for $h(\theta) = c$ in (3.3). From (5.14), we see that $A$ can be chosen as $\lambda^2/2\zeta^2(1 - \lambda^2)$.
Rewriting $B\big(Y, h(\theta)\big)$ as
(5.16) $e^{-M^2/2\zeta^2} \int_0^\infty h(\theta)\,e^{-(\theta^2 - 2\theta M)/2\zeta^2}\,d\theta$
for $M < 0$, and noting that the integral in (5.16) is $\big(\zeta^2 h(0)/|M|\big)$ plus terms of smaller order as $M \to -\infty$, we see from (5.13) that $\delta_2(Y)$ will always have the order of magnitude indicated in (5.15) as $Y \to -\infty$.
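The completion of the square behind (5.11) can be checked numerically. In this sketch the values of $c$, $A$, $\zeta$, and $M$ are arbitrary choices for the check, and the direct integral is evaluated by the trapezoid rule:

```python
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def B_closed(c, A, zeta, M):
    """(5.11): B(Y, h) in closed form for h(theta) = c*exp(-A*theta^2)."""
    k = 1.0 + 2.0 * A * zeta ** 2
    return (c * math.sqrt(2.0 * math.pi * zeta ** 2 / k)
            * math.exp(-A * M ** 2 / k) * Phi(M / (zeta * math.sqrt(k))))

def B_numeric(c, A, zeta, M, steps=200000, upper=50.0):
    """Direct trapezoid evaluation of
    B(Y, h) = int_0^inf h(theta) exp(-(theta-M)^2/(2 zeta^2)) dtheta."""
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * c * math.exp(-A * t * t) * math.exp(-(t - M) ** 2 / (2.0 * zeta ** 2))
    return total * h

c, A, zeta, M = 0.05, 0.3, 1.2, 0.7
print(B_closed(c, A, zeta, M), B_numeric(c, A, zeta, M))
```

The two evaluations agree to well under 1e-6 for moderate parameter values.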
Sequential procedures can be developed from the posterior expected loss function for these loss functions, just as was done in Section 4 for the loss function (1.1).
APPENDIX
Proofs for Section 2
In this Appendix, we supply the proofs of Lemmas 2.3 and 2.4 and of Corollary 2.3. We start with
PROOF OF LEMMA 2.3: Write $v = u - (\gamma\theta/\sigma)$ in (2.20), so that we are looking for the two solutions (one positive, one negative) of
(A.1) $\Phi\big[v + (\gamma\theta/\sigma)\big] = (\gamma/c\sigma)\,\varphi(v).$
The right side of (A.1) exceeds the left at $v = 0$. At $v = \pm\big[2\log\big(\gamma/c\sigma\sqrt{2\pi}\big)\big]^{1/2}$, we have unity for the right side, which exceeds the left side. Since $\Phi(x)$ is monotone increasing, and $\varphi(x)$ increases for $x < 0$ and decreases for $x > 0$, (2.23) is proved. Noting that, as $\theta$ increases, $\Phi[v + (\gamma\theta/\sigma)]$ increases monotonely to unity for each $v$ gives the monotonicity and limit results for $v_i(\gamma, \theta)$. The monotonicity of $u_i(\gamma, \theta)$ follows from the defining relation (2.20).
If $u_i(\gamma, \theta) > 0$, then $\Phi(u_i) > 1 - \big(\varphi(u_i)/u_i\big)$, so that, for $v$ satisfying (A.1),
(A.2) $(\gamma/c\sigma)\,\varphi(v) > 1 - \Big(\varphi\big[v + (\gamma\theta/\sigma)\big]\Big/\big[v + (\gamma\theta/\sigma)\big]\Big),$
or, writing $\varphi[v + (\gamma\theta/\sigma)] = \varphi(v)\,e^{-\frac12[(\gamma\theta/\sigma)^2 + 2v(\gamma\theta/\sigma)]}$, rearranging, and taking logarithms,
$-\log\sqrt{2\pi} + \log(\gamma/c\sigma) - \frac{v^2}{2} + \log\Big\{1 + (c\sigma/\gamma)\,e^{-\frac12[(\gamma\theta/\sigma)^2 + 2v(\gamma\theta/\sigma)]}\big/\big[v + (\gamma\theta/\sigma)\big]\Big\} > 0,$
so that
$\frac{v^2}{2} < \log\big(\gamma/c\sigma\sqrt{2\pi}\big) + (c\sigma/\gamma)\,e^{-\frac12[(\gamma\theta/\sigma)^2 + 2v(\gamma\theta/\sigma)]}\big/\big[v + (\gamma\theta/\sigma)\big],$
and, since $\sqrt{A + B} < \sqrt{A} + \sqrt{B}$,
(A.3) $\big[2\log\big(\gamma/c\sigma\sqrt{2\pi}\big)\big]^{1/2} + (2c\sigma/\gamma)^{1/2}\,\big[v + (\gamma\theta/\sigma)\big]^{-1/2}\,e^{-\frac14[(\gamma\theta/\sigma)^2 + 2v(\gamma\theta/\sigma)]} > |v|.$
Because $v_i(\gamma, \theta)$ has a limit, for $\theta > \theta^*$ (say), $v_i(\gamma, \theta) > -k$ (a specified $k$), and $\big[(\gamma\theta/\sigma) - 2k\big] > 1$, so that
$\big[2\log\big(\gamma/c\sigma\sqrt{2\pi}\big)\big]^{1/2} + (2c\sigma/\gamma)^{1/2}\,e^{-\theta\gamma/4\sigma} > |v|,$
which proves (2.24). From (2.20), we have
$\Phi\big(u(\gamma)\big) - \Phi\big(u(1)\big) = (1/c\sigma)\big(\gamma\,\varphi(u(\gamma)) - \varphi(u(1))\big) > 0.$
By Taylor's expansion,
$\big(u(\gamma) - u(1)\big) = (1/c\sigma)\,\big(\varphi(u(1))/\varphi(\bar u)\big)\,\big[\gamma\big(\varphi(u(\gamma))/\varphi(u(1))\big) - 1\big],$
where $u(1) \le \bar u \le u(\gamma)$, so that, since
$\big(\varphi(u(1))/\varphi(\bar u)\big) = c\sigma\,\Phi(u(1))/\varphi(\bar u) \le c\sigma\,\Phi(\bar u)/\varphi(\bar u) \le c\sigma\,\Phi(u(\gamma))/\varphi(u(\gamma)) = \gamma$
and
$\varphi(u(\gamma))/\varphi(u(1)) \le 1,$
(A.4) $\big(u(\gamma) - u(1)\big) \le (1/c\sigma)\,\gamma\,[\gamma - 1] \le (1/c\sigma)\,\big(1 + (\sigma^2/2\tau^2)\big)\,(\sigma^2/2\tau^2),$
using $\sqrt{1 + x} \le 1 + (x/2)$. When $\tau \ge 1$, (2.25) follows from (A.4). This completes the proof of the lemma.
Next, we give
PROOF OF LEMMA 2.4: We break the difference
(A.5) $\tau^2 \int_0^\infty \big[R(\theta, I_p) - R(\theta, I_\tau)\big]\,g_\tau(\theta)\,d\theta$
into four pieces, which will be bounded separately. First, consider the contribution due to the noncoverage probabilities, namely
(A.6) $\tau^2 \int_0^\infty \Big\{\big[\Phi\big(\gamma u_2(\gamma, \theta) - (\theta/\sigma)\big) - \Phi\big(\gamma u_1(\gamma, \theta) - (\theta/\sigma)\big)\big] - \big[\Phi\big(u_2(1, \theta) - (\theta/\sigma)\big) - \Phi\big(u_1(1, \theta) - (\theta/\sigma)\big)\big]\Big\}\,g_\tau(\theta)\,d\theta.$
Using the mean value theorem, we write the $\{\,\}$ term in (A.6) as
(A.7) $(\gamma - 1)\,G(\gamma^*, \theta),$
where
(A.8) $G(\gamma^*, \theta) = \sum_{i=1}^{2}\,(-1)^i\,\varphi\big(\gamma^* u_i(\gamma^*, \theta) - (\theta/\sigma)\big)\,\Big[\gamma^*\,\frac{\partial u_i(\gamma, \theta)}{\partial\gamma}\Big|_{\gamma = \gamma^*} + u_i(\gamma^*, \theta)\Big], \qquad 1 \le \gamma^* \le \gamma.$
Differentiating (2.20) with respect to $\gamma$, writing $u$ for $u(\gamma, \theta)$, and simplifying, we obtain
$\varphi(u)\,(\partial u/\partial\gamma) = (1/c\sigma)\,\big\{\varphi\big(u - (\gamma\theta/\sigma)\big) - \gamma\big[(\partial u/\partial\gamma) - (\theta/\sigma)\big]\big[u - (\gamma\theta/\sigma)\big]\,\varphi\big(u - (\gamma\theta/\sigma)\big)\big\},$
and, using (2.20) again,
$\varphi(u)\,(\partial u/\partial\gamma) = \Phi(u)\,\big\{(1/\gamma) - \big[(\partial u/\partial\gamma) - (\theta/\sigma)\big]\big[u - (\gamma\theta/\sigma)\big]\big\},$
so that
(A.9) $(\partial u/\partial\gamma) = \big\{(1/\gamma) + (\theta/\sigma)\big(u - (\gamma\theta/\sigma)\big)\big\}\Big/\big\{\big(\varphi(u)/\Phi(u)\big) + \big(u - (\gamma\theta/\sigma)\big)\big\}.$
Note that $\tau^2(\gamma - 1) < (\sigma^2/2)$, so that we want to show the boundedness of
(A.10) $\int_0^\infty g_\tau(\theta)\,G(\gamma^*, \theta)\,d\theta,$
and, by (A.8) and (A.9), we have
(A.11) $G(\gamma^*, \theta) = \sum_{i=1}^{2}\,(-1)^i\,\varphi\big(\gamma^* u_i(\gamma^*, \theta) - (\theta/\sigma)\big)\;\frac{1 + \big[u_i(\gamma^*, \theta) - \gamma^*(\theta/\sigma)\big]\big[u_i(\gamma^*, \theta) + \gamma^*(\theta/\sigma)\big] + u_i(\gamma^*, \theta)\,\varphi\big(u_i(\gamma^*, \theta)\big)\big/\Phi\big(u_i(\gamma^*, \theta)\big)}{\big(\varphi\big(u_i(\gamma^*, \theta)\big)\big/\Phi\big(u_i(\gamma^*, \theta)\big)\big) + u_i(\gamma^*, \theta) - \gamma^*(\theta/\sigma)}.$
The denominator in (A.11) is of order $(1/|u|^3)$ as $u \to -\infty$, so that, when the $\varphi\big(\gamma^* u - (\theta/\sigma)\big)$ term is taken into account, we see that $G(\gamma^*, \theta)$ is bounded as long as each $u_i(\gamma^*, \theta)$ is bounded above. This is the case if $\theta$ is bounded above, say by $A$. We next note from Lemma 2.3 that the denominator in (A.11) has a limit as $\theta \to \infty$ and is bounded away from zero for $\theta > A$. Since $\big(u\,\varphi(u)/\Phi(u)\big) \to 0$ as $u \to \infty$, and we can write $\big[u + (\gamma^*\theta/\sigma)\big] = \big[\big(u - (\gamma^*\theta/\sigma)\big) + 2\gamma^*\theta/\sigma\big]$, where $\big(u - (\gamma^*\theta/\sigma)\big)$ is bounded in the region of interest, we see that for $\theta > A$, $G(\gamma^*, \theta)$ is composed of some terms which are uniformly bounded plus a term of the form
(A.12) $\frac{2\gamma^*\theta}{\sigma}\Bigg\{\frac{\varphi(u_2)}{\dfrac{\varphi(u_2)}{\Phi(u_2)} + u_2 - \gamma^*\dfrac{\theta}{\sigma}} - \frac{\varphi(u_1)}{\dfrac{\varphi(u_1)}{\Phi(u_1)} + u_1 - \gamma^*\dfrac{\theta}{\sigma}} + \Big(u_1 - \gamma^*\frac{\theta}{\sigma}\Big)\Big(u_2 - \gamma^*\frac{\theta}{\sigma}\Big)\Big[\varphi\Big(\gamma^* u_2 - \frac{\theta}{\sigma}\Big) - \varphi\Big(\gamma^* u_1 - \frac{\theta}{\sigma}\Big)\Big]\Bigg\},$
where the arguments of $u_1$ and $u_2$ are suppressed to save space. By Lemma 2.3, for $\theta$ large enough, $u_i > (\theta/\sigma) - k$, so that $\varphi(u_i) \to 0$ as $\theta$ increases, and the first two terms in (A.12) are uniformly bounded. Since $\big(u_1 - \gamma^*\theta/\sigma\big)$ has a limit, by Lemma 2.3, as $\theta$ increases, our attention can be limited to
(A.13) $\int_0^\infty \theta\,g_\tau(\theta)\,\Big[\varphi\Big(\gamma^* u_2 - \frac{\theta}{\sigma}\Big) - \varphi\Big(\gamma^* u_1 - \frac{\theta}{\sigma}\Big)\Big]\,d\theta = \int_0^\infty \Big(\frac{\theta}{\tau}\Big)\,g_\tau(\theta)\,\Big\{\tau\Big[\varphi\Big(\gamma^* u_2 - \frac{\theta}{\sigma}\Big) - \varphi\Big(\gamma^* u_1 - \frac{\theta}{\sigma}\Big)\Big]\Big\}\,d\theta,$
which we now examine. Write the $\{\,\}$ term as
(A.14) $\tau\,\varphi\Big(\gamma^* u_1 - \frac{\theta}{\sigma}\Big)\,\Big\{e^{-\frac12(u_2 - u_1)\,\gamma^*\left[\gamma^*(u_1 + u_2) - (2\theta/\sigma)\right]} - 1\Big\}.$
Using the $v_i$ defined by (A.1) and the definition (2.5), the exponent in (A.14) becomes
(A.15) $-\tfrac12\,(u_2 - u_1)\,\gamma^*\big[(2\theta\sigma/\tau^{*2}) + \gamma^*(v_1 + v_2)\big],$
where $\tau < \tau^* < \infty$. By (2.24), we may take $A$ sufficiently large so that
(A.16) $|v_1 + v_2| < 2\sqrt{2c\sigma/\gamma^*}\;e^{-\theta\gamma^*/4\sigma} < 2\sqrt{2c\sigma}\;e^{-\theta/4\sigma}$
and
(A.17) $|u_2 - u_1| < 4\big[2\log\big(\gamma^*/c\sigma\sqrt{2\pi}\big)\big]^{1/2}.$
If (A.15) is negative, then (A.14) is negative, and (A.13) is bounded above by 0. Consider the set of $\theta$ for which (A.15) is positive, i.e., $(v_1 + v_2) < 0$ and $0 < (2\theta\sigma/\tau^{*2}) < -\gamma^*(v_1 + v_2)$, so that, by (A.16) and (A.17), (A.15) is less than
(A.18) $4\big[4c\sigma\,\log\big(\gamma^*/c\sigma\sqrt{2\pi}\big)\big]^{1/2}\,\gamma^{*2}\,e^{-\theta/4\sigma}.$
Assume $\tau$ large enough so that $\gamma^* < \gamma < 2$; then it is clear that, for $A$ large enough, there is a constant $D$, say, such that (A.14) is less than
(A.19) $D\,\tau\,e^{-\theta/4\sigma} < D\,(\tau/\theta),$
so that (A.13) is bounded by
(A.20) $\int_A^\infty D\,g_\tau(\theta)\,d\theta < D.$
Putting together the above pieces, we obtain the boundedness of (A.6). Next, we must examine the contributions to (A.5) due to the interval lengths. The first two terms of (2.19) represent
$(1/\gamma^2\sigma\sqrt{2\pi})\,\int_{-\infty}^{\sigma\gamma u(\gamma)} y\,e^{-(y - \theta)^2/2\sigma^2}\,dy,$
which contributes to $r(\tau, I_\tau)$
(A.21) $\frac{1}{\pi\sigma\gamma^2\tau}\int_0^\infty e^{-\theta^2/2\tau^2}\int_{-\infty}^{\sigma\gamma u(\gamma)} y\,e^{-(y - \theta)^2/2\sigma^2}\,dy\,d\theta = \frac{\tau\sqrt2}{\gamma^2\sqrt\pi}\,\Big[\Phi\big(\gamma u(\gamma)\big) - \gamma\,\Phi\big(u(\gamma)\big)\,e^{-\sigma^2 u^2(\gamma)/2\tau^2}\Big].$
The corresponding term for $r(\tau, I_p)$ is given by replacing $\gamma$ by 1 in (A.21). When the two terms are differenced, and (2.25) is used, the contribution to (2.26) will be found by straightforward computations to be $O(1/\tau)$.
Next, we examine the term
$(2\sqrt2\,\sigma/\gamma)\,\int_{-\infty}^{\infty}\big[-\log c\sqrt{2\pi}\,(\sigma/\gamma)\,\Phi\big(\gamma^{-1}(y + (\theta/\sigma))\big)\big]^{1/2}\,\varphi(y)\,dy,$
which, when averaged w.r.t. $g_\tau(\theta)$, becomes, by changing the order of integration,
(A.22) $\Big(4\sigma\Big/\sqrt{\pi\big(1 + (\tau^2/\sigma^2)\big)}\Big)\,\int_{-\infty}^{\infty}\big[-\gamma^{-2}\,\log c\sqrt{2\pi}\,(\sigma/\gamma)\,\Phi(x/\gamma)\big]^{1/2}\,\Phi(x/\gamma)\,e^{-x^2(\gamma^2 - 1)/2\gamma^2}\,dx.$
A similar expression for the term in $r(\tau, I_p)$ replaces $\gamma$ by 1 in the square-root term, but is otherwise the same. The contribution to (A.5) is then, omitting constants, writing $\big(1 + (\tau^2/\sigma^2)\big)$ as $\gamma^2/(\gamma^2 - 1)$, and writing $D$ for $(c\sqrt{2\pi}\,\sigma)$,
(A.23) $\frac{\tau^2\sqrt{\gamma^2 - 1}}{\gamma}\,\int_{-\infty}^{\infty}\Big\{\big[-\log D\,\Phi(x)\big]^{1/2} - \frac1\gamma\,\big[-\log(D/\gamma)\,\Phi(x/\gamma)\big]^{1/2}\Big\}\,\Phi(x/\gamma)\,e^{-x^2(\gamma^2 - 1)/2\gamma^2}\,dx.$
Rewrite the $\Phi(x/\gamma)\{\,\}$ term as
$\big[-\log D\,\Phi(x)\big]^{1/2}\,\big[\Phi(x/\gamma) - \Phi(x)\big] + \Big\{\Phi(x)\,\big[-\log D\,\Phi(x)\big]^{1/2} - (1/\gamma)\,\Phi(x/\gamma)\,\big[-\log(D/\gamma)\,\Phi(x/\gamma)\big]^{1/2}\Big\},$
and use the mean value theorem to write it as
$(\gamma - 1)\,H(x, \gamma^*), \qquad 1 < \gamma^* < \gamma,$
where
$H(x, \gamma^*) = (1/\gamma^{*2})\Big\{-x\,\varphi(x/\gamma^*)\,\big[-\log D\,\Phi(x)\big]^{1/2} + \big[\Phi(x/\gamma^*) + (x/\gamma^*)\,\varphi(x/\gamma^*)\big]\Big(\big[-\log(D/\gamma^*)\,\Phi(x/\gamma^*)\big]^{1/2} - \tfrac12\,\big[-\log(D/\gamma^*)\,\Phi(x/\gamma^*)\big]^{-1/2}\Big)\Big\}.$
It is easily seen, using the tail approximations for the normal distribution, that if $\gamma \le 2$ (say), $H(x, \gamma^*)$ is bounded uniformly in $x$ and $1 \le \gamma^* \le 2$. Since $\tau^2(\gamma - 1)$ is bounded also, we have, for some constant $J$, that (A.23) is less than
$J\,\big(\sqrt{\gamma^2 - 1}\big/\gamma\big)\,\int_{-\infty}^{\infty} e^{-x^2(\gamma^2 - 1)/2\gamma^2}\,dx = \sqrt{2\pi}\,J.$
The remaining term in (2.19) contributes to $r(\tau, I_\tau)$ a constant times the integrand of (A.22) integrated over the range $(-\infty, \gamma u(\gamma))$. Break the range into $(-\infty, u(1))$ and $(u(1), \gamma u(\gamma))$. Over the range $(-\infty, u(1))$, look at the difference written as in (A.23); it is seen that the same bound applies. What remains is the integral over $(u(1), \gamma u(\gamma))$. In this range, the integrand of (A.22) is bounded uniformly, and, by (2.25), the length of the range is $O(\tau^{-2})$. This completes the proof of Lemma 2.4. Next we prove Corollary 2.3.
PROOF OF COROLLARY 2.3: To evaluate the limit of (A.5) as $\tau^2 \to \infty$, we use the first terms of Taylor expansions in place of the mean-value-theorem approximations; i.e., we use $G(1, \theta)$ (that is, (A.11) with $\gamma^* = 1$) and
$H(x, 1) = \big[-\log D\,\Phi(x)\big]^{1/2}\,\Phi(x) - \tfrac12\,\big[-\log D\,\Phi(x)\big]^{-1/2}\,\big[\Phi(x) + x\,\varphi(x)\big]$
in place of $G(\gamma^*, \theta)$ and $H(x, \gamma^*)$, respectively. In (A.10), substitute $\theta = \tau\eta$ and consider the limit,
$\lim_{\tau\to\infty} G(1, \tau\eta) = 2c\sigma\,\big\{\big[-2\log c\sigma\sqrt{2\pi}\big]^{1/2} + \big[-2\log c\sigma\sqrt{2\pi}\big]^{-1/2}\big\}.$
Noting that $\tau^2(\gamma - 1) \to (\sigma^2/2)$, we have for the limit of (A.6)
(A.24) $c\sigma^3\,\big\{\big[-2\log c\sigma\sqrt{2\pi}\big]^{1/2} + \big[-2\log c\sigma\sqrt{2\pi}\big]^{-1/2}\big\}.$
In (A.23), use $z = x\big(\sqrt{\gamma^2 - 1}\big/\gamma\big)$ and
(A.25) $\lim_{\gamma\to1} H\Big(\frac{z\gamma}{\sqrt{\gamma^2 - 1}},\,1\Big) = \begin{cases} \big[-\log c\sigma\sqrt{2\pi}\big]^{1/2} - \tfrac12\,\big[-\log c\sigma\sqrt{2\pi}\big]^{-1/2} & \text{if } z > 0, \\ 0 & \text{if } z < 0, \end{cases}$
to obtain a limit for (A.23) of
(A.26) $\sqrt2\,\sigma^3\,\big\{\big[-\log c\sigma\sqrt{2\pi}\big]^{1/2} - \tfrac12\,\big[-\log c\sigma\sqrt{2\pi}\big]^{-1/2}\big\}.$
For the difference over the range $(-\infty, u(1))$, note that, after transformation, the range is $\big(-\infty,\; u(1)\big[\sqrt{\gamma^2 - 1}\big/\gamma\big]\big)$, which approaches $(-\infty, 0)$; noting (A.25), this contributes nothing. The integral over $(u(1), \gamma u(\gamma))$ is easily seen to be $O(\tau^{-3})$ and contributes negligibly. The net result is that (after multiplying (A.26) by $c$) we have (2.27), and the proof of Corollary 2.3 is complete.
REFERENCES
[1] Aitchison, J. and I. R. Dunsmore, "Linear-Loss Interval Estimation of Location and Scale Parameters," Biometrika, 55, 141-148 (1968).
[2] Bartholomew, D. J., "Contribution to the Discussion of the Paper by C. M. Stein, 'Confidence Sets for the Mean of a Multivariate Normal Distribution,'" J. Roy. Statist. Soc. B, 24, 291-292 (1962).
[3] Bartholomew, D. J., "A Comparison of Some Bayesian and Frequentist Inferences," Biometrika, 52, 19-35 (1965).
[4] Bickel, P. J. and J. A. Yahav, "Asymptotically Pointwise Optimal Procedures in Sequential Analysis," Proc. Fifth Berk. Symp. Math. Statist. Prob. (Univ. of California Press, 1965).
[5] Bickel, P. J. and J. A. Yahav, "Some Contributions to the Asymptotic Theory of Bayes Solutions," Stanford University Dept. of Statistics, Technical Rept. No. 24 (1967).
[6] Bickel, P. J. and J. A. Yahav, "Asymptotically Optimal Bayes and Minimax Procedures in Sequential Estimation," Ann. Math. Statist. 39, 442-456 (1968).
[7] Blumenthal, S. and A. Cohen, "Estimation of the Larger Translation Parameter," Ann. Math. Statist. 39, 502-516 (1968).
[8] Dudewicz, E. J., "Multiple-Decision (Ranking and Selection) Procedures: Estimation," Unpublished Thesis, Cornell University (1969).
[9] Farrell, R., "Estimators of a Location Parameter in the Absolutely Continuous Case," Ann. Math. Statist. 35, 949-998 (1964).
[10] Feller, W., An Introduction to Probability Theory and Its Applications (J. Wiley, New York, 1957), Vol. 1, 2nd ed.
[11] Ferguson, T. S., Mathematical Statistics, A Decision Theoretic Approach (Academic Press, New York, 1967).
[12] Joshi, V. M., "Admissibility of the Usual Confidence Sets for the Mean of a Univariate or Bivariate Normal Population," Ann. Math. Statist. 40, 1042-1067 (1969).
[13] Katz, M. W., "Admissible and Minimax Estimates of Parameters in Truncated Spaces," Ann. Math. Statist. 32, 136-142 (1961).
[14] Konijn, H., "Some Estimates Which Minimize the Least Upper Bound of a Probability Together with the Cost of Observation," Ann. Inst. Statist. Math. 7, 143-158 (1956).
[15] Konijn, H., "Minimax Interval Estimates with a Shortness Criterion: A New Formulation," Ann. Inst. Statist. Math. 15, 79-81 (1964).
[16] Paulson, E., "Sequential Interval Estimation for the Means of Normal Populations," Ann. Math. Statist. 40, 509-516 (1969).
[17] Sacks, J., "Generalized Bayes Solutions in Estimation Problems," Ann. Math. Statist. 34, 751-768 (1963).
[18] Stein, C. M., "Confidence Sets for the Mean of a Multivariate Normal Distribution," J. Roy. Statist. Soc. B, 24, 265-296 (1962).
THE VALUE STATEMENT
M. E. Nightengale
U.S. Air Force Academy
ABSTRACT
Frequently there exists an insufficient amount of actual data upon which to base a
decision. In this paper a method is presented whereby the subjective opinions of a group of
qualified persons are utilized to quantify the relative importance of a finite number of param-
eters or objectives. A means of testing the consistency among the judges is given which
allows the decision maker to determine the validity of the opinions gathered.
The application presented here is in the area of Multiple Incentive Contracting. Namely,
a method is proposed to facilitate the answering of the question, "What is the value to the
purchaser of an incremental change in the performance of a system?" Such a vehicle is not
essential to the methodology proposed.
INTRODUCTION
A current trend in governmental procurement is the use of a form of systems analysis in structuring
Multiple Incentive Contracts. The incentives are structured in such a manner that the government will
obtain a product that results in the minimum total cost for development, procurement, and operation
of the product in the fulfillment of a defined mission. The scale of payments from the buyer to the
seller depends upon an explicit statement of future contingencies; that is, the contractor receives
different rewards for different levels of performance. Where does this explicit statement originate?
The search for an answer to that question is likely to unearth another question; namely, how was the
mission defined? What thoughts, judgments, and opinions went into the definition of the mission?
Notice that the question itself is phrased in subjective terminology. Projections into the future
must necessarily be based on subjective evaluations rather than on quantitatively derived forecasts.
Even if there were some mathematical model available, the input assumptions and the interpretation
of the output are functions of the expertise of the individuals involved.
Since there is no proper theoretical foundation, and since one must rely to some extent on sub-
jective evaluations, there are but two options available to the buyer: (1) he can wait until there exist
sufficiently adequate quantitative methods, or (2) he can make the best of a poor situation and try to
use the opinions of a group of experts in some reliable and beneficial manner. Thus, one must almost
inevitably arrive at the decision to use the best data available — subjective evaluations.
This article will not delve into, nor include a discussion of, incentive contracting and the philosophy
underlying it. The interested reader is invited to read the Air Force Academy Manual, "The Evaluation
and Structuring Techniques of Multiple Incentive Contracts."
Thus far one has seen that the questions of how the rewards are determined and, in fact, how the
mission itself has been defined have led to the search for a method or methods of handling subjective
evaluations. Methods for quantifying such evaluations exist in the literature today. One of the major
problems in structuring incentive contracts has been that the proposed "solution" commences with the
assumption that the fundamental problem has been solved! That is to say, a curve has been drawn,
an equation postulated or some such scheme, and what remains to be accomplished is to manipulate
the tangents or derivatives and arrive at the proper conclusion. The basic problem, however, remains
unsolved.
The specific statement often referred to as "the value statement" is merely a statement of the
value to the government of an incremental increase, or decrease, in some parameter. In multiple
incentive contracting, this "value" is measured in dollars. The problem posed is then one of somehow
assigning these dollar values.
Since this is such a specific problem, it will perhaps be of more interest and value to present here
a method for handling the general case; that is, assigning numerical values to qualitative judgments in
discrete cases. The application of a general method to the specific problem will be readily apparent.
First, consider a discrete problem which may be exemplified by having a group of judiciously selected
experts choose the most desirable parameters to incentivize from among all of the parameters. The
basis for this method of solution is the Law of Comparative Judgment, described by Thurstone [6] in
1927. A detailed explanation and illustrative examples of this method may be found in Nightengale
[4, 5]. An explanation of the computational procedure will be presented here.
COMPUTATIONAL PROCEDURE
The computational procedure associated with the law of comparative judgment applied to ranking
the parameters involves the tabulation of the number of times parameter i is judged more significant
than parameter j. This tabulation is shown for n parameters in Table I.
Table I. Matrix A: Number of Times Parameter i Judged More Significant Than Parameter j

                           Parameter j
Parameter i      1       2       3     ...     j     ...     n
     1           —      a_12    a_13          a_1j          a_1n
     2          a_21     —      a_23          a_2j          a_2n
     3          a_31    a_32     —            a_3j          a_3n
     i          a_i1    a_i2    a_i3          a_ij          a_in
     n          a_n1    a_n2    a_n3          a_nj           —

a_ij = the number of times parameter i is judged to be more significant than parameter j, i ≠ j.
It should be noted that ties or indifference judgments are not allowed. This is consistent with the
assumptions involved in Thurstone's Case Five [6, p. 45], upon which this procedure is based. Although
the computations introduced by allowing ties could be easily handled, the law of comparative judgment
provides for only two choices. One should also be aware that if the number of parameters to be judged
is too great, fatigue and indifference will become important factors. The main diagonal of Matrix A
will be left blank because a parameter is not judged against itself. The total number of judgments
required to obtain the matrix is n(n−1)/2.
Matrix P is constructed from Matrix A and is used to display the percentile of times, out of the total
judgments, that parameter i was judged more significant than parameter j (see Table II). Once again
the diagonal will contain zeros. The elements of the P Matrix have the property that p_ij + p_ji = 1.
Table II. Matrix P: Percentile of Times Parameter i Judged More Significant Than Parameter j

                           Parameter j                             Row totals
Parameter i      1       2       3     ...     j     ...     n
     1           —      p_12    p_13          p_1j          p_1n      P_1
     2          p_21     —      p_23          p_2j          p_2n      P_2
     3          p_31    p_32     —            p_3j          p_3n      P_3
     i          p_i1    p_i2    p_i3          p_ij          p_in      P_i
     n          p_n1    p_n2    p_n3          p_nj           —        P_n

p_ij = the percentile of times parameter i is judged to be more significant than parameter j, i ≠ j.
p_ij = a_ij/c, where c is the number of judges.
Matrix Z is used to convert Matrix P into standard measurements of separation in terms of the
equal standard deviations of the discriminal dispersion scale (see Table III).
Table III. Matrix Z: Basic Transformation Matrix

                           Parameter j                          Total    Mean
Parameter i      1       2       3     ...     j     ...     n
     1           —      z_12    z_13          z_1j          z_1n    Z_1    Z̄_1
     2          z_21     —      z_23          z_2j          z_2n    Z_2    Z̄_2
     3          z_31    z_32     —            z_3j          z_3n    Z_3    Z̄_3
     i          z_i1    z_i2    z_i3          z_ij          z_in    Z_i    Z̄_i
     n          z_n1    z_n2    z_n3          z_nj           —      Z_n    Z̄_n

z_ij = the separation between parameter i and parameter j in terms of standard deviations, i ≠ j.
Z_i = Σ_j z_ij,   Z̄_i = Z_i/n.
The element z_ij is the standardized normal deviate corresponding to the element p_ij. The element
z_ij will be positive for all values of p_ij greater than 0.50 and negative for all values of p_ij less than 0.50.
Zeros are entered in the diagonal cells. The matrix is skew-symmetric; that is, z_21 = −z_12. That this
is necessarily so may be shown as follows:

p_ij = G(z_ij) = ∫_{−∞}^{z_ij} (2π)^{−1/2} exp{−z²/2} dz,

where z_ij is the unknown standardized normal variate. Since p_ij = 1 − p_ji, then

∫_{−∞}^{z_ij} (2π)^{−1/2} exp{−z²/2} dz = 1 − ∫_{−∞}^{z_ji} (2π)^{−1/2} exp{−z²/2} dz
   = ∫_{z_ji}^{∞} (2π)^{−1/2} exp{−z²/2} dz = ∫_{−∞}^{−z_ji} (2π)^{−1/2} exp{−z²/2} dz.

Hence, z_ij = −z_ji.
Matrix Z contains the sample estimates z_ij of the theoretical values found in the equation of the law
of comparative judgment. The element z_ij is then an estimate of the difference (S_i − S_j) between the
scale values of the two stimuli, measured in units of the standard deviation of the distribution of
discriminal differences. Each independent element of Matrix Z is an estimate of a value for one equation
of the law of comparative judgment. Z̄_i is the average difference between the scale value of the ith
stimulus and the scale value of each of the stimuli with which it was compared. Because of the manner
in which the scale was originally constructed, the Z̄_i values may then be transformed by use of Normal
Tables, G(Z̄_i), into a percentage of the area under the Normal Curve. The G(Z̄_i) values may then be
normalized and used as the decision-maker deems appropriate.
TESTING THE SCALE VALUES
The reliability of these subjective estimates of the significance of the parameters depends upon
the internal consistency of the data. A test exists for checking this internal consistency; it consists
of starting with the final scale values and working backward to the theoretical percentages demanded
by them. The discrepancies, designated Δ_ij, are compared with the mean of the absolute values of all
of the discrepancies. This mean is assumed to be the deviation about the expected p_ij. The average
deviation is computed by:

Average Deviation = Σ|Δ_ij|/N,   where N = number of discrepancy calculations.
The average deviation has several advantages: it is easy to define, easy to compute, includes all the
variates, and gives the appropriate weight to extreme cases. The disadvantage of using it is its unsuit-
ability for algebraic manipulations, since signs must be adjusted in its definition. Also, if the means and
mean deviations of two sets of variates are known, there is no ready method for calculating the mean
deviation of the combined set without going back to the original data.
At this point in the analysis, a test for the internal consistency of the judges' estimates has been
stated. The evaluation of the inconsistency calls for a judgment on the part of the analyst as to the
severity of the inconsistency. The possible consequences in terms of human lives, dollars, or whatever
units of value are involved play a most important role in this judgment. If there are significant discrepancies,
the procedure should be reevaluated to determine the possible causes and possibly point out new or
different means of gathering data. There are several avenues that can be pursued from this point: the
judges can be asked to once again rank the possible parameters; a new group of judges can be consulted
to do the ranking; or, at the worst, the entire analysis can be discarded with the decision maker being
left no worse off than he was when he was operating under complete uncertainty. This assumes that the
time and money invested are not significant.
If, on the other hand, there are no significant discrepancies, a tool exists for the assignment of
significance to the parameters.
HYPOTHETICAL EXAMPLE
As a simple hypothetical example, consider an airplane which the government desires to have
produced. Also, it has been decided that an incentive-type contract will be negotiated with the supplier.
However, exactly which parameters will have incentives placed upon them and the relative importance
of each parameter have not yet been determined.
It is necessary to determine the significance of each parameter. Ten individuals are available
who qualify as experts based on their past experiences, etc.
The parameters are as follows:
S I Airspeed. C III Combat Ceiling.
R II Combat Range. P IV Payload carrying capability.
The ten people were asked to rank the four parameters subjectively in order of significance.
The results are shown in Table IV, where a rank of 1 is best. Matrix A of the computational procedure
for this example is shown in Table V.
Table IV. Rank Orderings Submitted by the Judges

                        Parameters
Judges        S I     R II    C III    P IV
   1           3       2        1       4
   2           1       2        3       4
   3           3       1        2       4
   4           1       2        3       4
   5           3       1        2       4
   6           3       1        2       4
   7           3       2        4       1
   8           3       4        1       2
   9           2       4        1       3
  10           2       1        3       4
Σ Rank        24      20       22      34
Average Rank  2.4     2.0      2.2     3.4
Table V. Matrix A: Number of Times Parameter i Judged More Significant Than Parameter j

                        Parameter j
Parameter i    S I    R II   C III   P IV
   S I          X      4       4      8
   R II         6      X       7      7
   C III        6      3       X      9
   P IV         2      3       1      X
For example, Judge No. 7 ranked the parameters as to their significance in the following order:
P IV
R II
S I
C III
Matrix P and Matrix Z for this example are given in Tables VI and VII.
Table VI. Matrix P: Percentage of Times Parameter i Judged More Significant Than Parameter j

                        Parameter j                       Row totals
Parameter i    S I      R II     C III     P IV
   S I          X      0.400    0.400    0.800      1.600
   R II       0.600      X      0.700    0.700      2.000
   C III      0.600    0.300      X      0.900      1.800
   P IV       0.200    0.300    0.100      X        0.600
Table VII. Matrix Z: Basic Transformation Matrix

                        Parameter j                                  Total       Mean (Z̄)
Parameter i     S I         R II        C III       P IV
   S I            0        −0.25334    −0.25334     0.84161       0.33493      0.08373
   R II         0.25334       0         0.52441     0.52441       1.30216      0.32554
   C III        0.25334    −0.52441       0         1.28155       1.01048      0.25262
   P IV        −0.84161    −0.52441    −1.28155       0          −2.64757     −0.66189
The necessary values to compute the relative significance are shown in Table VIII.
Table VIII. Calculation of Relative Significance

Parameter i       Z̄           G(Z̄)       Normalized relative significance
   S I          0.08373      0.53356              0.2647
   R II         0.32554      0.62761              0.3115
   C III        0.25262      0.59972              0.2977
   P IV        −0.66189      0.25403              0.1261
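The entire chain of the computational procedure — Matrix A to Matrix P to Matrix Z to the normalized relative significance — can be sketched in Python using only the standard library. The function name and layout here are illustrative and not part of the original method statement; only the arithmetic follows the procedure above.

```python
from statistics import NormalDist

def relative_significance(A, judges):
    """Thurstone Case V scaling: dominance counts -> normalized weights."""
    n, nd = len(A), NormalDist()
    # Matrix P: proportion of judges rating i more significant than j.
    P = [[A[i][j] / judges if i != j else 0.0 for j in range(n)] for i in range(n)]
    # Matrix Z: standardized normal deviates; diagonal cells stay zero.
    Z = [[nd.inv_cdf(P[i][j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    zbar = [sum(row) / n for row in Z]     # mean separation for each parameter
    G = [nd.cdf(z) for z in zbar]          # area under the Normal Curve
    total = sum(G)
    return [g / total for g in G]          # normalize so the weights sum to 1

# Matrix A of the hypothetical example (Table V), ten judges:
A = [[0, 4, 4, 8],
     [6, 0, 7, 7],
     [6, 3, 0, 9],
     [2, 3, 1, 0]]
print(relative_significance(A, 10))  # close to Table VIII: 0.2647, 0.3115, 0.2977, 0.1261
```

Running this on the Table V counts reproduces the Table VIII column to within rounding.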
In order to check the internal consistency of the judges, it is necessary to begin with the final scale
values and reverse the computational procedure. The difference represented by (Z̄_i − Z̄_j) is transformed
by using the Normal Tables into G(Z̄_i − Z̄_j), which is the percentage of times parameter i theoretically
should have been judged to be more significant than parameter j.
Table IX. Consistency Check

                                             Z̄_i − Z̄_j    p (computed)
Z̄_I − Z̄_II    = 0.08373 − 0.32554    =      −0.24181       0.40447
Z̄_I − Z̄_III   = 0.08373 − 0.25262    =      −0.16889       0.43295
Z̄_I − Z̄_IV    = 0.08373 − (−0.66189) =       0.74562       0.77205
Z̄_II − Z̄_III  = 0.32554 − 0.25262    =       0.07292       0.52906
Z̄_II − Z̄_IV   = 0.32554 − (−0.66189) =       0.98743       0.83828
Z̄_III − Z̄_IV  = 0.25262 − (−0.66189) =       0.91451       0.81977
The difference between the computed percentage of times parameter i should have been judged
more significant than parameter j and the observed percentage of times parameter i was judged to
be more significant than parameter j is designated as Δ_ij. Each Δ_ij is compared to the mean of the
absolute values of all of the differences. This is assumed to be the deviation about the expected p_ij.
Table X. Deviation Calculations

Δ_I-II    = 0.40447 − 0.400 =  0.00447
Δ_I-III   = 0.43295 − 0.400 =  0.03295
Δ_I-IV    = 0.77205 − 0.800 = −0.02795
Δ_II-III  = 0.52906 − 0.700 = −0.17094
Δ_II-IV   = 0.83828 − 0.700 =  0.13828
Δ_III-IV  = 0.81977 − 0.900 = −0.08023

Σ|Δ_ij| = 0.45482

Average deviation = Σ|Δ_ij|/N = 0.45482/6 = 0.07580
The largest absolute discrepancy between observed and calculated values is 0.17094. This discrepancy
is less than three average deviations, which indicates that the data are reliable; that is, the
judges were consistent among themselves and the assignment of the values was not unjustified.
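The back-check can be coded directly from the scale values and the observed Matrix P. This is a sketch; the function and variable names are illustrative, and the numbers are taken from Tables VI and VII above.

```python
from statistics import NormalDist

def consistency_check(zbar, P):
    """Recompute theoretical p_ij from scale values and measure the discrepancies."""
    nd = NormalDist()
    n = len(zbar)
    deltas = [abs(nd.cdf(zbar[i] - zbar[j]) - P[i][j])
              for i in range(n) for j in range(i + 1, n)]
    avg_dev = sum(deltas) / len(deltas)    # the average deviation defined above
    return max(deltas), avg_dev

zbar = [0.08373, 0.32554, 0.25262, -0.66189]   # mean z values from Table VII
P = [[0.0, 0.4, 0.4, 0.8],                     # observed Matrix P (Table VI)
     [0.6, 0.0, 0.7, 0.7],
     [0.6, 0.3, 0.0, 0.9],
     [0.2, 0.3, 0.1, 0.0]]
worst, avg = consistency_check(zbar, P)
# worst ≈ 0.1709, avg ≈ 0.0758; worst < 3 * avg, so the data are judged consistent
```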
SUMMARY
The suggested technique presents the decision maker with yet another tool to be utilized in his
quest for assistance. A decision maker with an important decision to make under conditions of un-
certainty would surely welcome any assistance. It has been demonstrated that the method of Rank
Order can be used to change a decision under uncertainty to a decision under risk; that is, from a
decision with no probabilities known to one where some measure of the probabilities involved has been
determined. Morris [3] has a very good coverage of the principles of choice which may be applied to
such decisions.
In cases which qualify as decisions under uncertainty, this method can be applied. If there is
better information available it should be used, but in any case, all the information should be considered.
The use of subjectivities such as the above is discussed extensively in Edwards [1] and in Fishburn [2].
There are several facets of this technique that deserve further consideration. Perhaps the distribution
assumed should be one other than the Normal; perhaps a t-distribution would be more appropriate.
This would have to be examined closely by the analyst. In the development of PERT a weighting
factor obtained from the Beta distribution was applied to the estimates because the judges consistently
estimated too low. Maybe a factor of this type would be applicable here.
A record should be maintained, both individually and collectively, of the judges' predictions and
the subsequent results. Then a weighting factor could be introduced later in future judgments based
on past data.
Undoubtedly, there are many possible modifications to the technique presented in this section.
Further study may prove very rewarding, since the use of subjective judgments as an aid to decision
making is still a largely untapped field.
REFERENCES
[1] Edwards, W., "The Theory of Decision Making," Psychological Bulletin 51, 380-417 (1954).
[2] Fishburn, P. C., Decision and Value Theory (John Wiley and Sons, Inc., New York, 1964).
[3] Morris, W. T., Engineering Economy (Richard D. Irwin, Inc., Homewood, Illinois, 1960).
[4] Nightengale, M. E., "An Approach to Decisions Under Uncertainty," Arizona State University
Industrial Engineering Research Bulletin No. 1 (1965).
[5] Nightengale, M. E., "A Systems Approach to Strategy Formulation When Decisions Must be Made
Under Conditions of Uncertainty," Ph. D. Dissertation, Arizona State University, 1966.
[6] Thurstone, L. L., The Measurement of Values (University of Chicago Press, Chicago, 1959).
A QUEUEING PROCESS WITH VARYING DEGREE OF SERVICE
Meckinley Scott
Mathematics Department
University of Alabama
ABSTRACT
This article describes and presents some results of the analysis of a queueing process
where the degree of service given to a customer is subject to control by the management.
The control doctrine used is based on the queue length and also on the recent history of
the system.
In some queueing situations the total service given to a customer may be considered as consisting
of two or more stages. In fact, in any queueing process for which we can assume that the service time
has an Erlang distribution of order k, the servicing operation may be regarded as consisting
of k stages (regardless of whether or not these stages have a physical meaning). When stage 1 of service
is complete the customer passes immediately to stage 2, and so on. There are situations in which it is
necessary to distinguish between the various stages of service, for example, (i) those which are
essential and (ii) those which are not. In such a case, from the management point of view, it may be
desirable to dispense with the non-essential stages of service whenever there is a large accumulation of
customers in the queue.* Thus, the degree (extent) of service given to a customer is no longer fixed,
but a variable subject to control by the management. Since these types of problems have not been
discussed anywhere in the literature, the purpose of this note is to introduce and present some results
of the analysis of a simple queueing model of the above character.
DESCRIPTION OF THE MODEL
Consider a queueing process in which the total service is completed in two stages, one to be
followed immediately by the other. Starting from an instant in which the system is empty, the queue
length increases to a certain prescribed number R for the first time. At this point, the mechanism for the
second stage of service is shut off until the queue length drops to a value r (0 ≤ r ≤ R−1), at which
time both stages of service again become available. When the queue length increases again to R
(irrespective of whether or not, in the meantime, the system became empty) the same process is then
repeated. This policy does not, however, affect a customer whose second stage of service has already
begun. That is, unavailability of the second stage is to be understood in the sense that service of a
customer already in the second stage of service remains uninterrupted until completed, regardless
of the number of arrivals for the duration of his service time. Furthermore, it is assumed that unfortunate
customers who receive only the first stage of service do not return later for the second stage.
The above type of control on the service mechanism gives rise to the phenomenon of hysteresis.
The availability of the total service is no longer uniquely determined by the realized queue length.
*In this study the phrase "customers in the queue" is used in the sense that the one being served is also included.
Rather, the availability of the total service depends on the previous history of the system, that is,
on the path by which it arrived at the present state.
The waiting room is assumed to have a capacity N (N ≥ R), and the queue discipline is first-come,
first-served.† Customers arrive into the system according to the Poisson law with intensity λ. The
lengths of service for the first and second stage are independent random variables, each following an
exponential distribution with mean durations (1/μ1) and (1/μ2) respectively, and are independent
of the arrival process.
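The dynamics just described can be illustrated with a small discrete-event sketch. The function name, the horizon, and the seed are purely illustrative; the parameters are those of Table 1 below, so the long-run time averages of the empty probability, the fraction of time the second stage is off, and the queue length should approximate p(0), P(B_1)+P(B_2), and L respectively.

```python
import random

def simulate(lam, mu1, mu2, N, R, r, horizon, seed=1):
    """Simulate the two-stage queue with hysteresis control (r, R) and capacity N."""
    rng = random.Random(seed)
    t = 0.0
    n, stage, avail = 0, 1, 1          # queue length, stage in service, 2nd-stage flag
    t_empty = t_off = area = 0.0
    while t < horizon:
        rate = lam + (mu1 if stage == 1 else mu2) * (n > 0)
        dt = rng.expovariate(rate)
        t_empty += dt * (n == 0)
        t_off += dt * (avail == 0)
        area += dt * n
        t += dt
        if rng.random() < lam / rate:  # arrival (lost if the waiting room is full)
            if n < N:
                n += 1
                if n == R:
                    avail = 0          # queue reached R: shut off the second stage
                if n == 1:
                    stage = 1
        elif stage == 1 and avail:     # first stage done, second stage available
            stage = 2
        else:                          # departure after stage 1 (unavailable) or stage 2
            n -= 1
            stage = 1
            if n == r:
                avail = 1              # queue dropped to r: restore the second stage
    return t_empty / t, t_off / t, area / t

p0, pB, L = simulate(lam=1, mu1=2, mu2=3, N=10, R=6, r=0, horizon=1e5)
# estimates should be near Table 1 (r = 0): p(0) ≈ 0.247, P(B) ≈ 0.120, L ≈ 2.10
```

Note that a customer already in the second stage keeps the rate μ2 even after the flag is switched off, matching the set B_2 behavior described above.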
CHARACTERIZATION OF THE STATES IN THE QUEUE
Associated with the service availability at time t is an indicator variable, Z_t, which assumes, say,
a value 1 when the second stage of service is available and a value 0 when it is not. Let X_t denote the
queue size at time t; then from the description of the model above, it is clear that if X_t ≤ r then Z_t = 1
and if X_t ≥ R then Z_t = 0, but if r < X_t < R then Z_t may be either 1 or 0. Also, when at time t there
is at least one customer in the queue, let Y_t be another indicator variable which assumes, say, a value
1 when the customer being served is in the first stage of service and a value 2 when he is in the second
stage. Let E_0 and E_1 denote, respectively, the sets of states in which there is no customer and at least
one customer in the queue. A typical state of the set E_1 may be characterized by means of a triplet
(n, i, j) where n refers to the queue size and i and j the values of the indicator variables, Y_t and Z_t,
respectively. Of interest are four mutually exclusive and exhaustive subsets of E_1 given by, say,

A_i: {(n, i, 1); 1 ≤ n ≤ R−1},  (i = 1, 2),
B_1: {(n, 1, 0); r+1 ≤ n ≤ N},
B_2: {(n, 2, 0); R ≤ n ≤ N}.

Let A = A_1 ∪ A_2 and B = B_1 ∪ B_2. Then A describes the set of states where the second stage of service
is available and B is the set of states in which it is not. As explained earlier, note that unavailability
of the second stage of service does not affect a customer whose second stage of service has already
begun. The states of the set B_2 describe this situation. The flow diagram, Figure 1, helps illustrate
the way in which the queue operates.
STEADY STATE PROBABILITIES AND GENERATING FUNCTIONS
Let p_t(0) denote the probability that at time t there is no customer in the queue. Also, let p_t(n, i)
and q_t(n, i), (i = 1, 2) denote the probabilities that at time t there are n customers in the queue and
that the system is in set A_i and set B_i, respectively. That is,

p_t(n, i) = Pr[X_t = n, Y_t = i, Z_t = 1],  (i = 1, 2; n = 1, 2, …, R−1),
q_t(n, 1) = Pr[X_t = n, Y_t = 1, Z_t = 0],  (n = r+1, r+2, …, N),
q_t(n, 2) = Pr[X_t = n, Y_t = 2, Z_t = 0],  (n = R, R+1, …, N).

The steady-state probabilities corresponding to p_t(0), p_t(n, i), and q_t(n, i) will be denoted by p(0),
†The mathematical development of the process in this study does not specifically require the assumption of first-come,
first-served queue discipline.
Figure 1. Flow diagram showing queue operation (in the triplet (n, i, j), n refers to the queue size and i and j to the values of
the indicator variables Y_t and Z_t, respectively)
p(n, i), and q(n, i), respectively. The steady-state equations of the system can easily be derived in
the usual manner and, therefore, will not be presented here. It can be shown that the equations pertaining
to sets B_1 and B_2 yield the following solutions:

(1) q(n, 1) = [(1 − ρ1^(n−r))/(1 − ρ1)] q(r+1, 1),  (n = r+1, r+2, …, R−1),

(2) q(R, 1) = [(1 − ρ1^(R−r))/(1 − ρ1)] q(r+1, 1) − ρ1(1 − a) p(R−1, 2),

(3) q(n, 1) = ρ1^(n−R) [(1 − ρ1^(R−r))/(1 − ρ1)] q(r+1, 1)
        + [ρ1/(a − ρ1)][a^(n−R+2) − ρ1^(n−R)(a − ρ1 + aρ1)] p(R−1, 2),  (n = R+1, R+2, …, N−1),

(4) q(N, 1) = ρ1^(N−R) [(1 − ρ1^(R−r))/(1 − ρ1)] q(r+1, 1)
        + [ρ1²/(a − ρ1)][a^(N−R+1) − ρ1^(N−R−1)(a − ρ1 + aρ1)] p(R−1, 2),

(5) q(n, 2) = a^(n−R+1) p(R−1, 2),  (n = R, R+1, …, N−1),

(6) q(N, 2) = ρ2 a^(N−R) p(R−1, 2),

where

(7) ρ1 = λ/μ1 ≠ 1,

(8) ρ2 = λ/μ2,
(9) a = λ/(λ + μ2) ≠ ρ1.

Throughout the rest of this paper it will be assumed that ρ1 ≠ 1 and a ≠ ρ1. The cases where ρ1 = 1
and a = ρ1 can be obtained as limiting cases or by solving the equations directly. Now introduce the
following generating functions:

(10) P(u, i) = Σ_{n=1}^{R−1} u^n p(n, i),  (i = 1, 2).

Explicit solutions of the equations pertaining to sets A_1 and A_2 seem very difficult to obtain, but it can
be shown that

(11) P(u, i) = Δ(u, i)/Δ(u),  (i = 1, 2),

where

(12) Δ(u, 1) = (λu − λ − μ2)[λu(1 − u)p(0) + λu^(R+1) p(R−1, 1) − μ1 u^(r+1) q(r+1, 1)] − λμ2 u^R p(R−1, 2),

(13) Δ(u, 2) = λ(λu − λ − μ1) u^(R+1) p(R−1, 2) − μ1[λu(1 − u)p(0) + λu^(R+1) p(R−1, 1) − μ1 u^(r+1) q(r+1, 1)],

and

(14) Δ(u) = λ²(u − 1)(u − u1)(u − u2),

where

(15) u_k = [(λ + μ1 + μ2) ∓ √((λ + μ1 + μ2)² − 4μ1μ2)]/(2λ),  (k = 1, 2),

with u1 having the negative sign in front of the radical.
The probabilities of the sets A_1 and A_2 can be obtained by considering Eqs. 11 through 15, and
are given by

(16) P(A_1) = [1/(1 − ρ)][ρ1 p(0) + ρ1(1 − ρ2) p(R−1, 2) − (R − r) q(r+1, 1)]

and

(17) P(A_2) = [1/(1 − ρ)][ρ2 p(0) + ρ1ρ2 p(R−1, 2) − (R − r)(μ1/μ2) q(r+1, 1)],

respectively, where

(18) ρ = ρ1 + ρ2 ≠ 1.

The probabilities of the sets B_1 and B_2 can be obtained from Eqs. 1 through 9 and are, respectively,
given by

(19) P(B_1) = [(R − r)(1 − ρ1) − ρ1^(N−R+1)(1 − ρ1^(R−r))] q(r+1, 1)/(1 − ρ1)²
        + [ρ1/(a − ρ1)][aρ2 + (ρ1 − ρ2) a^(N−R+1) − (a − ρ1 + aρ1)(1 − ρ1^(N−R+1))/(1 − ρ1)] p(R−1, 2)
and

(20) P(B_2) = ρ2 p(R−1, 2).

It can also be shown that the expected queue length L, say, is given by

(21) L = L(A) + L(B),

where

(22) L(A) = [1/(1 − ρ)²][(ρ − ρ1ρ2) p(0)
        − [(R − r)/ρ1]{ρ1 + ρ1ρ2 + ρ1² + ½ρ(1 − ρ)(R + r + 1)} q(r+1, 1)
        + ρ1{ρ − ρ1ρ2 + R(1 − ρ)} p(R−1, 2)]

and

(23) L(B) = [½(R − r)(R + r + 1) + [ρ1/(1 − ρ1)]{(R − r) − (1 − ρ1^(R−r))(N + 1/(1 − ρ1)) ρ1^(N−R)}] q(r+1, 1)/(1 − ρ1)
        + [[Rρ2(a − ρ1 + ρ1a²) + ρ1(a − ρ1 + aρ1)(1 − a^(N−R)) + Nρ1(ρ1 − ρ2) a^(N−R+1)]/(1 − ρ1)
        − [(a − ρ1 + aρ1)/(1 − ρ1)](N + 1/(1 − ρ1)) ρ1^(N−R+1)] p(R−1, 2)/(a − ρ1).

In fact, L(A) is the contribution to the expected queue length L from the set A = A_1 ∪ A_2, and similar
remarks apply to L(B).
So far, all quantities have been expressed in terms of p(0), p(R−1, 1), p(R−1, 2) and q(r+1, 1);
these can be determined by solving the following set of linear equations:

(24) λp(R−1, 1) + λp(R−1, 2) − μ1 q(r+1, 1) = 0,

(25) (u_i − 1)p(0) + (u_i^r − u_i^R) p(R−1, 1) + [u_i^r + μ2 u_i^(R−1)/(λu_i − λ − μ2)] p(R−1, 2) = 0,  (i = 1, 2),

(26) p(0)/(1 − ρ)
        + [ρ1/(1 − ρ) + ρ2 + [ρ1/(a − ρ1)]{aρ2 + (ρ1 − ρ2) a^(N−R+1) − (a − ρ1 + aρ1)(1 − ρ1^(N−R+1))/(1 − ρ1)}] p(R−1, 2)
        + [[(R − r)(1 − ρ1) − ρ1^(N−R+1)(1 − ρ1^(R−r))]/(1 − ρ1)² − (R − r)ρ/(ρ1(1 − ρ))] q(r+1, 1) = 1.

Equations 24 and 25 above are derived from the condition that in P(u, 1) the zeros of the numerator
coincide with the zeros of the denominator,* and Eq. 26 is obtained from the normalizing condition
for the sum of all probabilities.

*An equivalent set of equations is obtained by considering P(u, 2). Equation 24 may also be obtained from equations of the
set B.
Tables 1 through 4 give some numerical computations of p(0), P(A_i), P(B_i), (i = 1, 2), and L.

Table 1. N = 10, R = 6, λ = 1, μ1 = 2, μ2 = 3

 r     p(0)     P(A_1)   P(A_2)   P(B_1)   P(B_2)     L
 0    0.2469    0.3806   0.2524   0.1188   0.0013   2.1038
 1    0.2361    0.3968   0.2631   0.1026   0.0014   2.1510
 2    0.2265    0.4113   0.2727   0.0881   0.0015   2.2282
 3    0.2178    0.4244   0.2812   0.0749   0.0017   2.3339
 4    0.2095    0.4368   0.2891   0.0624   0.0021   2.4736
 5    0.2012    0.4494   0.2967   0.0497   0.0029   2.6667
Table 2. N = 10, R = 7, λ = 1, μ1 = 2, μ2 = 3

 r     p(0)     P(A_1)   P(A_2)   P(B_1)   P(B_2)     L
 0    0.2321    0.4031   0.2678   0.0961   0.0009   2.2551
 1    0.2243    0.4149   0.2756   0.0843   0.0010   2.2920
 2    0.2172    0.4256   0.2827   0.0735   0.0010   2.3516
 3    0.2107    0.4354   0.2891   0.0637   0.0011   2.4315
 4    0.2047    0.4445   0.2950   0.0545   0.0013   2.5320
 5    0.1990    0.4533   0.3006   0.0456   0.0016   2.6585
 6    0.1930    0.4624   0.3060   0.0364   0.0022   2.8285
Table 3. N = 12, R = 6, λ = 1, μ1 = 0.95, μ2 = 10

 r     p(0)     P(A_1)   P(A_2)   P(B_1)   P(B_2)     L
 0    0.0450    0.1308   0.0124   0.8118   0.0000   6.8329
 1    0.0425    0.1619   0.0154   0.7802   0.0000   6.8613
 2    0.0403    0.1923   0.0182   0.7492   0.0000   6.9021
 3    0.0384    0.2221   0.0211   0.7184   0.0000   6.9533
 4    0.0367    0.2517   0.0239   0.6877   0.0000   7.0153
 5    0.0353    0.2810   0.0266   0.6570   0.0001   7.0921
Table 4. N = 12, R = 7, λ = 1, μ1 = 0.95, μ2 = 10

 r     p(0)     P(A_1)   P(A_2)   P(B_1)   P(B_2)     L
 0    0.0436    0.1552   0.0147   0.7865   0.0000   6.8743
 1    0.0411    0.1870   0.0178   0.7541   0.0000   6.9074
 2    0.0389    0.2182   0.0207   0.7221   0.0000   6.9521
 3    0.0370    0.2491   0.0236   0.6902   0.0000   7.0069
 4    0.0354    0.2799   0.0266   0.6582   0.0000   7.0701
 5    0.0339    0.3107   0.0295   0.6258   0.0000   7.1421
 6    0.0327    0.3418   0.0324   0.5930   0.0001   7.2271
AVAILABILITY OF SECOND STAGE OF SERVICE
For the queueing process under consideration it is clear that the most interesting characteristics
are those concerned with the availability or unavailability of the second stage of service. The proba-
bility of the set B given by
(27) P(B) = P(B_1) + P(B_2)

may be interpreted as the fraction of time the second stage of service is unavailable, whereas

(28) 1 − P(B) = p(0) + P(A)
             = p(0) + P(A_1) + P(A_2)

is the fraction of time it is available. In some instances, it may also be important to know how often the
mechanism for the second stage of service is switched off or switched on. Let f denote the expected
frequency per unit time the mechanism is switched off. Under steady-state conditions, f also equals
the expected frequency per unit time the mechanism is switched on. In fact, Eq. 24 reflects this property
of the system. We have then

(29) f = μ1 q(r+1, 1)
       = λ{p(R−1, 1) + p(R−1, 2)}.
A FIRST-PASSAGE TIME PROBLEM
This section deals with the distribution of the time that elapses before the system enters the set A,
starting initially from one of the states in the set B. Let f_{n,i}(t), (i = 1, 2) denote the probability density
function (p.d.f.) of the required distribution, given that initially the system was in a state (n, i, 0) of the
set B. A little reflection shows that f_{R,2}(t) is identical with the p.d.f. of the time that elapses from the
instant the system enters the set B from the set A until it reenters the set A (that is, from the instant
the service mechanism for the second stage is switched off until it is switched on again), given that
entry into the set B is made through a state (R, 2, 0). Similar remarks apply to the p.d.f. f_{R,1}(t). A
little further reflection shows that f_{n,1}(t), (n ≥ r+1) is the same as the distribution of the time, in an
M|M|1 queue with capacity N, that elapses before the queue size drops to a value r, starting initially
from a value n. If we denote by f*_{n,i}(s) the Laplace transform of the p.d.f. f_{n,i}(t), then using the
same kind of argument as in [4], it can be shown that

(30) f*_{n,1}(s) = k1 α1^n + k2 α2^n,  (n = r+1, r+2, …, N),

(31) f*_{n,2}(s) = b1 [k1 α1^(n−1){1 − (b2α1)^(N−n)}/(1 − b2α1) + k2 α2^(n−1){1 − (b2α2)^(N−n)}/(1 − b2α2)]
        + b3 b2^(N−n) {k1 α1^(N−1) + k2 α2^(N−1)},  (n = R, R+1, …, N),

where

(32) α1 = [(λ + μ1 + s) + √((λ + μ1 + s)² − 4λμ1)]/(2λ),

(33) α2 = [(λ + μ1 + s) − √((λ + μ1 + s)² − 4λμ1)]/(2λ),

(34) b1 = μ2/(λ + μ2 + s),

(35) b2 = λ/(λ + μ2 + s),

(36) b3 = μ2/(μ2 + s),

(37) k1 = c1 α2^(N−r−2)(c3 − α2) / [α1^(r+1){α1^(N−r−2)(1 − c2α2)(α1 − c3) + α2^(N−r−2)(1 − c2α1)(c3 − α2)}],

(38) k2 = c1 α1^(N−r−2)(α1 − c3) / [α2^(r+1){α1^(N−r−2)(1 − c2α2)(α1 − c3) + α2^(N−r−2)(1 − c2α1)(c3 − α2)}],

(39) c1 = μ1/(λ + μ1 + s),

(40) c2 = λ/(λ + μ1 + s),

(41) c3 = μ1/(μ1 + s).

The expected value E{T_{n,i}}, say, corresponding to the p.d.f. f_{n,i}(t) is given by

(42) E{T_{n,1}} = (n − r)/(μ1 − λ) − [λ/(μ1 − λ)²]{(λ/μ1)^(N−n) − (λ/μ1)^(N−r)},  (n = r, r+1, …, N), and

(43) E{T_{n,2}} = E{T_{n−1,1}} + 1/μ2 + λ{1 − a^(N−n)}/[μ2(μ1 − λ)]
        − [λ²/(μ1(μ1 − λ)(λ + μ2 − μ1))]{(λ/μ1)^(N−n) − a^(N−n)},  (n = R, R+1, …, N).

Notice that when λ and μ1 are fixed and μ2 → ∞, then E{T_{n,2}} → E{T_{n−1,1}}, a result which is intui-
tively obvious. Furthermore, when μ1 = μ2 then E{T_{n,2}} = E{T_{n,1}}, (n = R, R+1, …, N).
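Equation 42 can be checked numerically: in the M|M|1 queue with capacity N the expected one-step descent times satisfy T_k = (1 − (λ/μ1)^(N−k+1))/(μ1 − λ), and summing T_k for k = r+1, …, n must reproduce the closed form. A small sketch (the function names are illustrative, and λ ≠ μ1 is assumed):

```python
def passage_closed(n, r, N, lam, mu1):
    """Closed form of Eq. 42 for E{T_{n,1}} (expected time to drop from n to r)."""
    rho = lam / mu1
    return (n - r) / (mu1 - lam) - lam / (mu1 - lam) ** 2 * (rho ** (N - n) - rho ** (N - r))

def passage_by_steps(n, r, N, lam, mu1):
    """Same quantity as a sum of expected one-step descent times T_k."""
    rho = lam / mu1
    return sum((1 - rho ** (N - k + 1)) / (mu1 - lam) for k in range(r + 1, n + 1))
```

The two expressions agree for every n and r, and for n = r both reduce to zero, as Eq. 42 requires.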
DISCUSSION
In this paper, we have presented expressions for several important characteristics of a queueing
process where the degree of service given to a customer is subject to control by the decision maker.
A knowledge of these characteristics enables us, in principle, to construct a reasonable objective
function based on costs originating from three sources: (a) a cost proportional to the queueing times of
customers, (b) costs associated with the two stages of service, and (c) costs associated with switching
the mechanism for the second stage of service. In any given situation where the parameters of the serv-
ice times and arrival time distributions are fixed and the objective function defined, the value of the
objective function may be computed for any given set of the control variables r, R and N. Needless to
say, the problem of deriving optimal sets of the control variables is one of considerable complexity
and we will not pursue this subject any further.
In a certain sense, the model discussed in this paper may also be considered as a queueing process
involving a variable number of servers when it is assumed that the two stages of service are operated
by two different servers. Queueing processes involving a variable number of servers have received
some attention in the literature; see, for example, [2]. However, in those models every customer is given the
same degree of service and two or more customers may be served simultaneously, whereas this is not
so in the present case.
REFERENCES
[1] Cox, D. R. and W. L. Smith, Queues (Methuen and Wiley, 1961).
[2] Moder, J. J. and C. R. Phillips, Jr., "Queueing with Fixed and Variable Channels," Operations
Research, Vol. 10, No. 2 (Mar-Apr., 1962).
[3] Morse, P. M., Queues, Inventories and Maintenance (John Wiley and Sons, New York, 1958).
[4] Naor, P., "On a First-Passage Time Problem in Chemical Engineering," University of North
Carolina, Institute of Statistics, Mimeo Series No. 400 (Aug. 1964).
[5] Saaty, T. L., Elements of Queueing Theory (McGraw-Hill, New York, 1961).
[6] Yadin, M. and P. Naor, "On Queueing Systems with Variable Service Capacities," Nav. Res.
Log. Quart. 14, 43-53 (1967).
MAD: MATHEMATICAL ANALYSIS OF DOWNTIME
Edward B. Brandt
The RAND Corporation
Washington, D.C.
and
Dilip R. Limaye
Decision Sciences Corporation
Jenkintown, Pa.
ABSTRACT
The MAD model presents a mathematical treatment of the relationship between air-
craft reliability and maintainability, system manning and inspection policies, scheduling
and sortie length, and aircraft downtime. Log normal distributions are postulated for subsystem
repair times, and simultaneous repair of malfunctions is assumed. The aircraft downtime
for maintenance is computed from the distribution of the largest of k log normally distributed
repair times. Waiting time for maintenance men is calculated either by using a multiple-channel
queuing model or by generating the distribution of the number of maintenance men required
and comparing this to the number of men available to determine the probability of waiting
at each inspection.
INTRODUCTION
The maintenance characteristics of alternative aircraft designs should be properly represented
in cost-effectiveness trade studies through their impact on both cost and effectiveness. The cost impact of
greater maintenance requirements is recognized through higher manning tables, increased costs of
ground support equipment, etc. However, maintenance also influences measures of effectiveness.
The traditional approach in aircraft reliability analyses relates malfunctions to flight hours. Because
downtime is influenced partly by the number of malfunctions and the number of malfunctions depends
on utilization, availability decreases with higher utilization. Therefore, downtime determines either
the utilization consistent with a specified availability or the availability consistent with a specified
utilization. Both of these measures are important parameters in analyses of effectiveness.
The MAD model relates the downtime of an aircraft to the component or subsystem repair times,
maintenance occurrences, inspection philosophy, flight scheduling, sortie length, and the number
of men assigned to the maintenance facility.
The model can be used to:
• Determine the downtime implications of changes in reliability (R) and maintainability (M).
• Evaluate the impact of additional maintenance men on aircraft waiting time and determine
the optimum number of maintenance men.
*Research on this paper was performed by the authors at the Vertol Division, The Boeing Company, Philadelphia, Pa.
• Determine the effect of alternative inspection philosophies on downtime. In particular, the
model is sensitive to the flight hours between inspections and the amount of deferred maintenance
for each type of inspection.
• Evaluate the surge capability of the system; i.e., the ability to fly more than usual for a given
period of time.
GENERAL DESCRIPTION OF MODEL
The model comprises several subroutines. The overall flow diagram of the model is shown in Figure 1.
[Figure 1. MAD model flow chart. Inputs (elapsed maintenance times per action, maintenance action rates, flight hours between inspections, number of inspections, probability of detecting a failure at inspection, numbers of maintenance men and operational aircraft, working hours in a maintenance day, NORS (not operationally ready, spares) downtime, and administrative delays) feed subroutines that compute the probability of k maintenance actions at the ith inspection, the expected downtime due to the longest of k simultaneous maintenance actions, the look-phase downtime for each type of inspection, the probability of and expected downtime due to waiting for maintenance men, the total downtime per flight hour D, and the utilization/availability relation A = f(U, D, m).]
SPECIFIC SUBROUTINES
Probability Distribution of Maintenance Actions
The model is formulated under the assumption that most failures are repaired at inspections.
In other words, the aircraft is not considered to abort its mission and immediately return to base
for every failure. Failures causing a mission abort are treated separately and assumed to be repaired
individually. All other maintenance actions are located and repaired at scheduled intervals such as
preflight, postflight, daily, and periodic inspections; multiple actions can be repaired simultaneously
when the required maintenance men are available.
The Poisson distribution describes the number of failures occurring at each inspection. The funda-
mental assumption underlying the use of a Poisson distribution is that individual failures are inde-
pendent. Given independence, it can be proved that, for a complex system of many components, the
probability distribution of the time between actions is exponential, regardless of the individual failure
law for each component [1]. The distribution of the number of actions in a specified time period is
Poisson when the time between actions is exponential.
Therefore, the probability of x failures occurring at the ith inspection is given by

$P_i(x) = \dfrac{(\lambda t_i)^x e^{-\lambda t_i}}{x!},$

where λ = failure rate (occurrences per flight hour), and $t_i$ = flight hours between inspections (i − 1) and i.
It is not assumed that all inspections are so thorough that every failure will be detected and then
repaired. For example, a certain inspection may not check for transmission leaks or hydraulic pressure.
Additionally, some failures are not flight-critical and are deferred to later inspections. All failures not
repaired at the inspection directly after their occurrence are considered undetected or deferred. These
failures are noted and examined at succeeding inspections until there is a major inspection in the
inspection cycle, at which time all previously undetected actions are detected and repaired.
The probability of repair at the ith inspection (Pi) can be derived either empirically or on the basis
of the inspected components and their relative occurrence. The probability distribution function of
repairing k failures at the ith inspection, given that x have occurred, is binomial:

(1) $P_i(k|x) = \dbinom{x}{k} P_i^{\,k} (1-P_i)^{x-k}.$
Because the distribution of failures occurring is Poisson, the marginal distribution of the number
of actions repaired at the ith inspection is
(2) $P_i(k) = \sum_{x=k}^{\infty} P_i(k|x) P_i(x) = \dfrac{(P_i \lambda t_i)^k e^{-P_i \lambda t_i}}{k!}.$

The distribution of the number of maintenance actions repaired at the ith inspection is therefore
Poisson with mean $P_i \lambda t_i$.
The probability distribution of the number of failures that are not repaired is Poisson with mean
$(1-P_i)\lambda t_i$. In the notation defined above, the unrepaired failures for the first inspection ($U_1$) are

(3) $U_1 = \left(\dfrac{1-P_1}{P_1}\right)\bar m_1,$

where $\bar m_1$ = expected number of actions repaired at inspection 1 = $P_1 \lambda t_1$.
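The thinning argument behind Eq. (2), that Poisson failures each independently repaired with probability $P_i$ yield a Poisson number of repairs, can be checked numerically. The following sketch uses invented parameter values, not data from the paper:

```python
import random

# Illustrative parameters (not from the paper): failure rate, flight hours,
# and probability of repair at the inspection.
lam, t1, P1 = 0.4, 5.0, 0.6
trials = 200_000

repaired = []
for _ in range(trials):
    # Sample x ~ Poisson(lam*t1) by counting unit-rate exponential
    # inter-arrival times that fit before lam*t1.
    x = 0
    acc = random.expovariate(1.0)
    while acc < lam * t1:
        x += 1
        acc += random.expovariate(1.0)
    # Binomial thinning: each failure is repaired with probability P1.
    k = sum(random.random() < P1 for _ in range(x))
    repaired.append(k)

mean_k = sum(repaired) / trials
var_k = sum((k - mean_k) ** 2 for k in repaired) / trials
# For a Poisson distribution, mean and variance both equal P1*lam*t1 = 1.2.
print(round(mean_k, 2), round(var_k, 2))
```

With 200,000 trials both sample moments should sit close to $P_1 \lambda t_1 = 1.2$, as Eq. (2) predicts.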
The available actions at the second inspection include the unrepaired failures from the first
inspection plus any failures occurring between the first and second. The probability of repair at the
second inspection is applied to all available actions. Therefore,

(4a) $\bar m_2 = P_2\left[\left(\dfrac{1-P_1}{P_1}\right)\bar m_1 + \lambda t_2\right]$ and $U_2 = \left(\dfrac{1-P_2}{P_2}\right)\bar m_2.$

In general,

(4b) $\bar m_n = P_n\left[\left(\dfrac{1-P_{n-1}}{P_{n-1}}\right)\bar m_{n-1} + \lambda t_n\right]$ and $U_n = \left(\dfrac{1-P_n}{P_n}\right)\bar m_n.$
Consider a constant injection of new actions occurring at a rate λ per inspection and a constant probability
of repair P at each inspection. The expected number of unrepaired failures at the nth inspection,
$U_n$, is given by:
(4c) $U_n = \sum_{i=1}^{n} (1-P)^i \lambda.$

As n becomes large,

(4d) $\lim_{n\to\infty} U_n = \left(\dfrac{1-P}{P}\right)\lambda, \qquad 0 < P \le 1,$

and the expected number of repaired actions is

(4e) $\lim_{n\to\infty} \bar m_n = P\left[\left(\dfrac{1-P}{P}\right)\lambda + \lambda\right] = \lambda.$
In cases where there is a major inspection (n+1) in the inspection cycle at which all previously undetected
actions are detected and repaired, we have $P_{n+1} = 1$ and therefore

(4f) $\bar m_{n+1} = \left(\dfrac{1-P_n}{P_n}\right)\bar m_n + \lambda t_{n+1}$ and $U_{n+1} = \left(\dfrac{1-P_{n+1}}{P_{n+1}}\right)\bar m_{n+1} = 0.$

The succeeding (n+1) inspections will then follow the same pattern as inspections 1 to (n+1), and
thus a steady-state cyclical pattern evolves.
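The limits in (4d) and (4e) can be verified by iterating the recursion directly. A minimal sketch, with an illustrative injection rate and repair probability:

```python
# Iterate the recursion (4b) with constant injection rate and constant
# repair probability, and check the limits (4d) and (4e).
# Parameter values are illustrative, not taken from the paper.
lam = 2.0   # new actions per inspection (lambda * t, held constant)
P = 0.7     # probability of repair at each inspection

m = P * lam                    # first inspection: m1 = P * lambda * t
for n in range(200):
    U = ((1 - P) / P) * m      # unrepaired failures after this inspection
    m = P * (U + lam)          # (4b): actions repaired at the next inspection

U = ((1 - P) / P) * m
print(round(m, 4))   # -> 2.0, the limit lambda of (4e)
print(round(U, 4))   # -> 0.8571, the limit (1-P)/P * lambda of (4d)
```

The repaired actions converge to the injection rate λ, and the backlog of unrepaired failures stabilizes at $(1-P)\lambda/P$, exactly as the closed forms state.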
Equations (4a) through (4f) imply an assumption that an undetected failure is as likely to be detected
in a succeeding inspection as a new failure. This assumption is not critical to the computation of down-
time. When the total number of inspections in the inspection cycle is small and the probabilities Pi are
large, the total aircraft downtime calculation is not significantly affected if an alternative assumption
is made that a failure not detected at an inspection remains undetected until the major inspection.
In the latter case, the equations for the undetected failures and repaired actions will be given by:

(4g) $\bar m_2 = P_2 \lambda t_2,$

(4h) $U_2 = \left(\dfrac{1-P_2}{P_2}\right)\bar m_2 + \left(\dfrac{1-P_1}{P_1}\right)\bar m_1,$

$\vdots$

(4k) $\bar m_n = P_n \lambda t_n,$

(4p) $U_n = \sum_{i=1}^{n}\left(\dfrac{1-P_i}{P_i}\right)\bar m_i,$

(4q) $\bar m_{n+1} = \lambda t_{n+1} + \sum_{i=1}^{n}\left(\dfrac{1-P_i}{P_i}\right)\bar m_i,$

(4r) $U_{n+1} = 0.$
Distribution of the Longest Maintenance Action
The aircraft is out of commission for only the longest individual repair time when maintenance
items are repaired simultaneously. It is assumed that the only limitation to simultaneous repair results
from insufficient manpower; this is accounted for in maintenance waiting times. Downtime from active
maintenance at each inspection ($D_i$) is given by

(5) $D_i = \sum_{k=1}^{\infty} E_L(k) P_i(k),$

where $E_L(k)$ = expected repair time of the longest of k simultaneous maintenance actions.
The probability distribution of maintenance actions Pi(k) is given by Eq. (2). The expected value of
the longest repair time is
(6) $E_L(k) = \int_0^\infty y\, f_{Y^*}(y)\, dy = \int_0^\infty y\, k\,[F(y)]^{k-1} f(y)\, dy,$

where F(y) = the cumulative distribution function of the time to repair a single action.
A numerical solution can be easily obtained if one assumes that each subsystem or component will
always be repaired in the same time. However, this assumption is unrealistic, especially when higher
levels of aggregation are necessary. The chosen method of solution uses a continuous probability density
function to describe repair times. By this method, the variability within as well as between subsystems
can be properly represented.
The log normal distribution was selected to represent the times to repair for the following reasons:
1. The log normal is a flexible, two-parameter distribution which is skewed to the right. The
probability of zero repair time is always zero.
2. The log normal distribution has been shown to fit well to reaction times of humans to more
complicated perceptual patterns involving some degree of learning [2].
3. The actual distributions of repair times for individual systems and the entire aircraft compare
favorably to the log normal for both helicopters and fixed-wing aircraft.
530 E. B. BRANDT AND D. R. LIMAYE
The log normal distribution is given by

(7) $f_Y(y) = \dfrac{1}{y\sigma\sqrt{2\pi}} \exp\left\{-\dfrac{1}{2\sigma^2}\ln^2\left(\dfrac{y}{y_m}\right)\right\}, \qquad 0 < y < \infty,$

where σ = standard deviation of ln y, and $y_m$ = median of y.
If $y_1, y_2, \ldots, y_n$ are samples from a log normal distribution, the distribution of the largest,
$Y^* = \max(y_1, y_2, \ldots, y_n)$, is given by

(8) $f_{Y^*}(y) = n\,[F_Y(y)]^{n-1} f_Y(y),$

where

(9) $F_Y(y') = \int_0^{y'} \dfrac{1}{y\sigma\sqrt{2\pi}} \exp\left\{-\dfrac{1}{2\sigma^2}\ln^2\left(\dfrac{y}{y_m}\right)\right\} dy.$
Substituting this result in (6), we get

(10) $E_L(k) = \int_0^\infty y\, f_{Y^*}(y)\, dy = \int_0^\infty y\, k\,[F_Y(y)]^{k-1} f_Y(y)\, dy,$

where $f_Y(y)$ and $F_Y(y)$ are given by (7) and (9), respectively.
The integral in (10) is not easy to evaluate; however, Gumbel [3] has derived approximate results for
different values of k. These results were used in the model.
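In place of Gumbel's tabulated approximations, the integral in (10) can also be evaluated by direct numerical quadrature after substituting $z = \ln(y/y_m)/\sigma$, which turns it into an integral against the standard normal density. A sketch, with illustrative median and σ values:

```python
import math

def phi(z):   # standard normal density
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def Phi(z):   # standard normal distribution function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def longest_repair_mean(k, y_med, sigma, lo=-8.0, hi=8.0, steps=4000):
    """E_L(k) of Eq. (10): expected value of the largest of k log normal
    repair times, written as an integral over z = ln(y/y_m)/sigma and
    evaluated by the trapezoidal rule (in place of Gumbel's tables)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        z = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * y_med * math.exp(sigma * z) * k * Phi(z) ** (k - 1) * phi(z)
    return total * h

# Illustrative parameters: median repair time 1.5 h, sigma = 0.8.
# For k = 1 this reduces to the ordinary lognormal mean y_m * exp(sigma^2/2).
print(round(longest_repair_mean(1, 1.5, 0.8), 3))
print(round(longest_repair_mean(4, 1.5, 0.8), 3))
```

The k = 1 case provides a built-in accuracy check against the closed-form lognormal mean, and $E_L(k)$ increases with k as expected for a maximum.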
Maintenance Waiting Times
This category accounts for the additional downtime resulting from the inability to achieve com-
plete maintenance concurrency because of insufficient manpower. It is obvious that waiting times are
influenced significantly by the distribution of failure occurrences. Poor planning of the assignment of
aircraft or erratic demands for their use can lead to substantial waiting in one period and idle time
in another.
Two approaches have been programmed with the hope of obtaining a most probable value and a
lower limit for maintenance waiting time. The approach which yields the most probable value assumes
that aircraft will arrive simultaneously at the maintenance facility with the requirement that the
combined actions be processed at once. The second approach assumes perfect aircraft scheduling in
which the expected number of actions arriving at the maintenance facility is constant per unit time
period.
Bulk Arrival of Maintenance Actions — Each aircraft is assumed to go through the identical series
of inspections throughout the entire inspection cycle. The inspection cycle may last for a number of
periods, after which the aircraft is processed through a major inspection and is considered returned
to a new condition. When many aircraft are assigned to the same facility, the combined maintenance
actions must be processed together to avoid delays. Each aircraft is assigned a position in the inspection
cycle. The distribution for the combined actions is Poisson with mean equal to

(11) $\bar m_T = \sum_{j=1}^{N_A} \bar m_{ij},$

where $\bar m_{ij}$ = expected number of actions for aircraft j at inspection i (see Eqs. (4a) and (4b)), and
$N_A$ = total number of aircraft.
It is now necessary to find the distribution of the number of maintenance men required, given that k
actions are to be repaired simultaneously. Consider the aircraft as a collection of tasks each requiring
a particular number of men. It is possible to construct a distribution of the number of men per task.
By employing the central limit theorem, the distribution of the average number of men required is
normal, given that the number of actions is large. The central limit theorem is applicable here because
the region in which manpower shortage exists occurs only at high values of k. Given k actions, the
distribution of the total number of men required can be assumed normal with mean $\mu_{m,k}$ and standard
deviation $\sigma_{m,k}$, where

$\mu_{m,k} = kE(A)$ and $\sigma_{m,k} = \sigma_A\sqrt{k},$

where E(A) = expected value of the number of men per task, and
$\sigma_A$ = standard deviation of the number of men per task.
The aircraft must wait if the number of men required (M) exceeds the number available ($M_0$). Therefore,
the probability of waiting given that k actions have occurred is

(12) $P_w(k) = P[M > M_0 \mid k] = \int_{M_0}^\infty f_M(m)\, dm = \int_{M_0}^\infty \dfrac{1}{\sigma_{m,k}\sqrt{2\pi}} \exp\left\{-\dfrac{1}{2}\left(\dfrac{m-\mu_{m,k}}{\sigma_{m,k}}\right)^2\right\} dm.$
The total probability of waiting is

(13) $P_w = \sum_k P_w(k) P(k) = \sum_k \dfrac{\bar m_T^{\,k}\, e^{-\bar m_T}}{k!} \int_{M_0}^\infty \dfrac{1}{\sigma_{m,k}\sqrt{2\pi}} \exp\left\{-\dfrac{1}{2}\left(\dfrac{m-\mu_{m,k}}{\sigma_{m,k}}\right)^2\right\} dm.$
This probability can be evaluated at each inspection. To determine the expected waiting time, it is
assumed that the repair time for a task is independent of the manloading. Then the waiting time when
the number of men required is m is given by

(14) $W(m) = \left(\dfrac{m - M_0}{M_0}\right)\bar R_t,$

where $\bar R_t$ = average elapsed repair time.
The expected waiting time is then given by

(15) $E[W(m)] = \sum_k \dfrac{\bar m_T^{\,k}\, e^{-\bar m_T}}{k!} \int_{M_0}^\infty \left(\dfrac{m - M_0}{M_0}\right)\bar R_t\, \dfrac{1}{\sigma_{m,k}\sqrt{2\pi}} \exp\left\{-\dfrac{1}{2}\left(\dfrac{m-\mu_{m,k}}{\sigma_{m,k}}\right)^2\right\} dm.$
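Equations (12) through (15) can be evaluated directly, since the integrals over the normal density have closed forms in terms of the normal distribution function. The sketch below uses invented inputs (the values of $\bar m_T$, E(A), $\sigma_A$, $M_0$, and $\bar R_t$ are purely illustrative):

```python
import math

def Phi(z):   # standard normal distribution function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

# Illustrative inputs (not the paper's data):
m_T   = 3.0    # mean of combined Poisson actions, Eq. (11)
E_A   = 2.5    # expected men per task
sig_A = 1.0    # std dev of men per task
M0    = 12     # maintenance men available
R_t   = 1.8    # average elapsed repair time (hours)

P_wait, E_wait = 0.0, 0.0
for k in range(1, 60):          # k = 0 actions contributes no waiting
    Pk = math.exp(-m_T) * m_T ** k / math.factorial(k)   # Poisson P(k)
    mu, sd = k * E_A, sig_A * math.sqrt(k)               # normal men demand
    a = (M0 - mu) / sd
    P_wait += Pk * (1.0 - Phi(a))                        # Eqs. (12)-(13)
    # Eq. (15): E[(m - M0)/M0 * R_t ; m > M0] under the normal density,
    # via the closed-form partial expectation of a normal variate.
    partial = mu * (1 - Phi(a)) + sd * math.exp(-0.5 * a * a) / math.sqrt(2 * math.pi)
    E_wait += Pk * (partial - M0 * (1 - Phi(a))) * R_t / M0

print(round(P_wait, 4), round(E_wait, 4))
```

The partial-expectation identity $E[m; m > M_0] = \mu(1-\Phi(a)) + \sigma\varphi(a)$ replaces the numerical integration that the paper's formulas would otherwise require.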
Continuous Arrivals of Maintenance Actions. This approach uses the standard multiple-channel
waiting-time model. The required input information is the arrival rate, service time, and the number of
channels. For our problem, the maintenance actions represent the arrivals; the repair time is the service
time; and the number of maintenance teams is the channels.
The arrival rate is a function of the occurrences per flight hour, flight hours per unit time period,
number of aircraft assigned to maintenance facility, and the total maintenance working hours (clock
time) per unit time period. The number of channels is equal to the total men assigned to the maintenance
facility divided by the average men per action. The service time distribution is given by the distribution
of elapsed repair times.
The main advantages of the queuing theory approach are its simplicity and the availability of
computational procedures to determine solutions. The main disadvantage is the assumptions that are
forced upon the problem in order to facilitate the mathematical solution.
These assumptions are as follows:
1. Arrival rate — The expected number of actions arriving at the maintenance facility is constant
per unit time interval. This assumes nearly perfect scheduling which ensures a constant demand rate
for maintenance men.
The probability distribution of the time between arrivals is exponential.
2. Service time — The expected service time for an action is equal to the average time to repair
(ATTR).
The probability distribution of repair times is assumed to be exponential.
3. Numbers of channels — The number of men per task is constant.
Under these assumptions, the standard multiple-channel queuing model is applicable and the
expected value of waiting time can be easily determined.
Let λ = arrival rate,
μ = mean service rate,
s = number of channels;
then the expected waiting time is given by Ref. [4]:

$E(\omega) = \dfrac{P_0\,(\lambda/\mu)^s\,\rho}{\lambda\, s!\,(1-\rho)^2},$

where $\rho$ = density = $\dfrac{\lambda}{\mu s}$, and

$P_0$ = probability that zero teams are working
$= \left[\sum_{n=0}^{s-1} \dfrac{(\lambda/\mu)^n}{n!} + \dfrac{(\lambda/\mu)^s}{s!\,(1-\rho)}\right]^{-1}.$
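The multiple-channel result quoted above from Hillier and Lieberman [4] is straightforward to compute. A sketch; the arrival rate, service rate, and channel count are illustrative:

```python
import math

def mms_expected_wait(lam, mu, s):
    """Expected waiting time in queue for the standard multiple-channel
    (M/M/s) model, as in Ref. [4]. lam = arrival rate, mu = service rate
    per channel, s = number of channels; requires rho = lam/(mu*s) < 1."""
    rho = lam / (mu * s)
    assert rho < 1.0, "queue is unstable"
    r = lam / mu
    p0 = 1.0 / (sum(r ** n / math.factorial(n) for n in range(s))
                + r ** s / (math.factorial(s) * (1.0 - rho)))
    lq = p0 * r ** s * rho / (math.factorial(s) * (1.0 - rho) ** 2)
    return lq / lam    # Little's law: expected wait = L_q / lambda

# Example: 4 maintenance teams, 3 actions/hour arriving, 1 action/hour each
print(round(mms_expected_wait(3.0, 1.0, 4), 3))   # -> 0.509
```

At ρ = 0.75 the four-channel system already imposes about half an hour of expected queueing delay per action, illustrating how quickly waiting grows as ρ approaches 1.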
Comparison of the Two Methods. The bulk-arrival method is the more realistic representation of
actual conditions. Furthermore, it is compatible with the assumptions made in other sections of the
model because it considers the sortie length and the distribution of men per task. The major disad-
vantage is that the number of working hours between aircraft arrivals is required as input.
The queuing-theory approach depicts a highly optimistic system wherein maintenance actions are
arriving continuously at a constant rate. These assumptions contradict the maintenance downtime
routine which assumes the simultaneous repair of tasks when the aircraft undergoes an inspection.
Furthermore, the service time distribution is considered exponentially distributed with a constant
manloading while the maintenance downtime routine assumes log normal service times with variable
manloading. The queuing-theory approach can be refined by assuming other distributions of service
times, but then the computational procedures become very complex.
RELATION BETWEEN AVAILABILITY, UTILIZATION, AND DOWNTIME
The relationship between availability and utilization has been traditionally expressed as shown
in Figure 2. This approach assumes downtime per flight hour to be fixed.
[Figure 2. Traditional A/U plots: average utilization per day versus availability, drawn as straight lines for fixed downtimes per flight hour $d_1$ and $d_2$, with $d_2 > d_1$.]
The results of the MAD model show a somewhat different picture. Availability no longer responds
linearly to average daily utilization. This occurs because some inspections are on a calendar basis, or
before and after each flight. When utilization increases, downtime per flight hour increases and forces
the curve to bend back because of the additional waiting time accompanying the longer flight program
(see Figure 3). The curves designated $T_1$, $T_2$, $T_3$, and $T_4$ represent larger manning levels ($T_4 > T_3 > T_2 > T_1$).
[Figure 3. MAD A/U plots: average utilization per day versus availability, with the curves bending back at high utilization, one curve per manning level $T_1$ through $T_4$.]
SUMMARY
The MAD model incorporates a new approach for the determination of the interrelationship
between aircraft malfunction rates and repair times, inspection philosophy, and number of men as-
signed to the maintenance facility; and between availability, utilization, and downtime. The model is
therefore suitable for evaluating the cost-effectiveness implications of changes in subsystem failure
rates and maintenance times. It can also evaluate different inspection philosophies (such as flight-hour
inspections versus calendar-day inspections) and determine the optimum number of men required
to achieve a specified level of availability and utilization. Such analyses have traditionally been per-
formed with detailed simulations of the maintenance environment incorporating Monte Carlo tech-
niques. In the MAD model, attempts have been made to represent the significant characteristics of the
real-world situation as accurately as possible using probability distributions. The model is thus simple
and easy to use, and requires significantly lower running time than Monte Carlo simulations. The
level of detail in this model makes it suitable for tradeoffs and sensitivity analyses in the concept-
formulation and early contract-definition phases.
The MAD model has been programmed and is operational on an IBM 360/65. It has been validated
by testing it against actual maintenance data from Army CH-47 helicopters at Fort Rucker, Alabama.
The model is now being used for subsystem tradeoffs and sensitivity analyses for the HLH (Heavy-Lift
Helicopter) and LIT (Light Intratheater Transport) Programs.
REFERENCES
[1] Drenick, R. F., "The Failure Law of Complex Equipment," J. Soc. Indus. and Appl. Math. (Dec. 1960), pp. 680-690.
[2] Goldman, A. S. and T. B. Slattery, Maintainability: A Major Element of System Effectiveness (John Wiley and Sons, New York, 1967).
[3] Gumbel, E. J., Statistics of Extremes (Columbia University Press, New York, 1960).
[4] Hillier, F. S. and G. J. Lieberman, Introduction to Operations Research (Holden-Day, 1967).
A METHODOLOGY FOR ESTIMATING EXPECTED USAGE OF REPAIR
PARTS WITH APPLICATION TO PARTS WITH NO USAGE HISTORY
Sheldon E. Haber and Rosedith Sitgreaves*
The George Washington University
School of Engineering and Applied Science
Institute for Management Science and Engineering
ABSTRACT
In this paper a model is presented which focuses on the difficult problem of pre-
dicting demands for items with extremely low usage rates. These form the bulk of repair
parts in military systems. The basic notion underlying the model is the pooling of usage data
for common design items with movement for the purpose of estimating usage rates for similar
items which have shown no movement.
A unique feature of the model is that it also makes possible the estimation of usage rates
for items newly introduced into a system for which no previous usage history is available.
0. INTRODUCTION
The problem of predicting demands for individual repair parts in military inventory systems has
received much attention over the last two decades. This problem is a complicated one because of the
sporadic nature of demands for military repair parts. For most repair parts, no demands are registered
over long periods of time and when items are demanded, they are generally demanded only once or
twice. This fact has now been documented by many usage studies† and is once again documented
in this study. To illustrate the nature of the demand problem under consideration, usage data for 61
submarine patrols are shown in Table 1. As may be seen from the first entry in the table, no usage
TABLE 1. Distribution of 25,138 Different Repair Parts by the Number of Patrols in Which They Were Demanded

    Number of patrols    Frequency of different parts
    0                    21,597
    1                     1,776
    2                       673
    3                       333
    4                       199
    5                       134
    6                        96
    7                        81
    8                        62
    9                        32
    10                       28
    11-61                   127
    Total                25,138
*Columbia University.
†For a review of this literature, see Henry Solomon, "A Summary of Logistics Research Project's Experience With Problems of Demand Prediction" in [1].
was recorded for the vast majority of items, i.e., the vast majority of items was not demanded in any
one of the 61 patrols. Of those that were demanded, one-half were demanded in exactly one patrol.
Thus for most repair parts with usage, the problem of estimating usage rates is a difficult one. The
difficulty is compounded, by an order of magnitude, for the bulk of the items whose usage history shows
zero units used.
In this paper we will be concerned with the estimation of expected usage of repair parts for the
purpose of computing shipboard allowance lists. A shipboard allowance list is defined as the range
and depth of repair parts to be stocked aboard ship to meet uncertain demands. The range of repair
parts refers to the number of different items to be stocked. The depth refers to the number of units
stocked of an item.
Given that repair part usage is sporadic, several demand prediction strategies are available. The
most widely practiced approach is that of employing usage estimates provided by technicians, i.e.,
supply personnel responsible for provisioning of repair parts. In practice, these have been found to
be conservative and lead to relatively expensive stockage lists. Such conservatism, however, is pre-
ferred to a much more extreme approach which might assign a zero usage estimate to a repair part
until positive usage is experienced. The difficulty with this latter approach is that many repair parts
are only one-time movers. Failure to have an adequate quantity of stock aboard ship or in the supply
system prior to the first demand can thus lead to a large range of shortages and an unsatisfactory level
of readiness.
Another approach that has been utilized to estimate usage of slow moving items is exponential
smoothing [2]. In this procedure, a technician's usage estimate is generally employed as an initial esti-
mate. Hence, the initial procurement for repair parts will be based solely on the technician's estimate
and will thus be subject to the limitation already noted.*
One procedure for the problem at hand is to utilize information not directly pertaining to the repair
part being considered.† This is the approach of this paper. The information used pertains to the class
of repair parts of which the given repair part is a member. It is assumed that usage data are available
for the repair part class and that some of the items in that class have shown movement in the past. The
advantage of this procedure is that it permits the pooling of demands where they have occurred and
the use of this information for making positive usage estimates for items for which zero usage has been
recorded. The procedure also provides an expected usage estimate for new items being introduced
into the inventory system for which no usage history is available.
The criterion used in this study for defining repair part class is that of nomenclature of which
resistor, washer, motor, and valve are examples. It should be noted that within a given class, estimated
usage rates will vary depending on the design, environment, location, etc., of each part. The usefulness
of partitioning by nomenclature rests on the assumption that variations in usage rates within a given
nomenclature class will be less than that among classes. In this case, the stratification of repair items
should reduce the variance of the estimates vis-a-vis the alternative of not distinguishing items by
nomenclature class.
In the next section we present a theoretical model for estimating expected usage of repair parts,
in particular repair parts with no usage history. Following a description of the model, goodness-of-fit
*Furthermore, for zero-movers this procedure will quickly lead to usage estimates that are indistinguishable from zero.
†A model of this type, which uses information pertaining to the failure behavior of the part's parent component, is described in [5].
tests are applied. In the final section, the model is evaluated by developing alternative allowance lists
and comparing these lists against actual submarine usage data.
1. THE PROBABILITY MODEL
We consider a class C of items defined, for example, in terms of nomenclature. Let part f be any
item classified as belonging to class C, and let $y_f = 0, 1, 2, \ldots$, represent the total quantity of units
demanded for part f in a specified time period T, say a total of T patrols.* We consider a probability
model in which the quantity y for a given part (more precisely, $y_f$) is a random variable with a Poisson
distribution given by

(1) $p(y|\theta) = \dfrac{\theta^y e^{-\theta}}{y!},$

where θ (again more precisely, $\theta_f$) is the parameter of the Poisson distribution of demands for item f
in a unit time period, in our case, a single patrol.† Note that θ is thus the expected number of units demanded
for part f in a patrol. It is assumed further that demands for part f in non-overlapping periods
of time are independently distributed.
Our problem is to estimate θ for any item f classified as belonging to class C. We distinguish two
cases. In the first case, we are concerned with estimating θ for installed items for which usage data,
i.e., y values, are available. As indicated earlier, the problem here is complicated by the fact that for
the majority of items no usage is recorded, i.e., the observed y values during T time periods are zero. In
the second case, we would like to estimate θ, that is, the expected usage, for items classified as belonging
to class C but being installed for the first time. In this case, no y values, zero or otherwise, are
available.
In both cases, it seems intuitively reasonable to assume that positive usage data for some members
of the class should be useful in determining estimates of the θ values for the remaining members. We
formalize this by postulating that θ is itself a random variable with a probability distribution over all
items in the class C. We then use standard theory to obtain the desired estimates for both cases mentioned
above.
In general, if p(θ) denotes the probability distribution of θ in the class C, and p(y, θ) the joint
distribution of y and θ, then

(2) $p(y, \theta) = p(y|\theta)\cdot p(\theta) = p(\theta|y)\cdot p(y),$

where p(y) denotes the unconditional distribution of y values for the class C, and p(θ|y) the conditional
distribution of θ, given y.** In the first case mentioned above, in which the observed y = 0, 1, 2, . . ., we
estimate θ by

$\hat\theta = E(\theta|y)$

from the conditional distribution of θ. In the second case, when y values do not exist, we estimate θ
*For simplicity, we do not distinguish between the random variable y and the values which it assumes. Throughout, we also use the symbol p( ) to represent different probability distributions.
†The subscript f has been omitted from y and θ here and in the subsequent discussion to facilitate exposition.
**In Bayesian terms, p(θ) is the prior distribution, and p(θ|y) the posterior distribution.
by the value E(θ), the unconditional expected value of θ. In this latter case, the estimate of θ is the
same for all new items in the class C, while in the former case θ varies for each item in class C depending
on its y value.
In considering possible distributions for θ, we assume that the class C can be extended in such a
way that θ can be treated as a continuous variable with a probability density function. The preponderance
of y values of zero in most classes led us to consider densities whose maximum value occurs for
θ = 0. We examined first the exponential density

$p(\theta) = \dfrac{1}{\beta}\, e^{-\theta/\beta}, \qquad 0 < \theta < \infty,$

but the resulting calculations did not give evidence of a good fit. A natural extension, which because of
its mathematical properties seemed particularly appropriate for the distribution of a Poisson parameter,
is a two-parameter gamma distribution. Accordingly, we assumed that

(3) $p(\theta|\alpha, \beta) = \left(\dfrac{\alpha}{\beta}\right)^{\alpha} \dfrac{\theta^{\alpha-1} e^{-\alpha\theta/\beta}}{\Gamma(\alpha)}, \qquad 0 < \theta < \infty,$

with α, β > 0. For any value of α < 1, this function is infinite at θ = 0, and is monotonically decreasing
as θ increases from 0 to ∞.
From (1) and (3), Eq. (2) can be written specifically as

(4) $p(y, \theta|\alpha, \beta) = p(y|\theta)\cdot p(\theta|\alpha, \beta) = \left(\dfrac{\alpha}{\beta}\right)^{\alpha} \dfrac{\theta^{\alpha+y-1}\, e^{-(\alpha/\beta+T)\theta}\, T^y}{\Gamma(\alpha)\, y!}$

$= \left[\left(\dfrac{\alpha}{\beta}+T\right)^{\alpha+y} \dfrac{\theta^{\alpha+y-1}\, e^{-(\alpha/\beta+T)\theta}}{\Gamma(\alpha+y)}\right]\cdot\left[\dfrac{\Gamma(\alpha+y)}{\Gamma(\alpha)\, y!}\, \dfrac{\alpha^{\alpha}\,(T\beta)^y}{(\alpha+T\beta)^{\alpha+y}}\right]$

$= p(\theta|y, \alpha, \beta)\cdot p(y|\alpha, \beta).$

Thus, the conditional distribution of θ, given y, also has the form of a gamma distribution, while the
unconditional distribution of y for the class C is a negative binomial.
From (4) and (3), we find

(5) $\hat\theta = E(\theta|y) = \dfrac{\beta(\alpha+y)}{\alpha+T\beta}$

for the first case where y values are available, while

(6) $E(\theta) = \beta$

for the second case where items are being installed for the first time and there are no y values.
For any item in class C with an observed y value, we now have from (5) an estimator for the expected
usage for the item, namely $\hat\theta$. The estimator can be evaluated for every y value, including zero, if we
have values for the two parameters α and β. We estimate these parameters from the observed set of
y values for the class C, treating these values as a set of independent observations from a negative
binomial distribution with mean value $T\beta$ and with variance $T\beta\left(1 + \dfrac{T\beta}{\alpha}\right)$.
Let $y_1, y_2, \ldots, y_n$ be the observed y values for the n items f = 1, 2, . . ., n in class C. From the
data we estimate the mean and variance by

$\bar y = \dfrac{1}{n}\sum_{i=1}^{n} y_i$, and $V = \dfrac{1}{n}\sum_{i=1}^{n} (y_i - \bar y)^2$, respectively.

We estimate $T\beta$ by $\bar y$, so that

$\hat\beta = \dfrac{\bar y}{T}.$

In estimating α, we use the method of moments since this is relatively simple and straightforward.
Since the variance of y is $T\beta\left(1 + \dfrac{T\beta}{\alpha}\right)$, we estimate the variance as

$T\hat\beta\left(1 + \dfrac{T\hat\beta}{\hat\alpha}\right)$, and with $T\hat\beta = \bar y$, from above, obtain

$\hat\alpha = \dfrac{\bar y^2}{V - \bar y}.$
Hence, in the case where an observed y value is available for a given item, the desired estimate of θ for the item is

\hat{\theta} = \frac{\hat{\beta}(\hat{\alpha} + y)}{\hat{\alpha} + T\hat{\beta}},

and in particular, when y = 0, we have

\hat{\theta} = \frac{\hat{\alpha}\hat{\beta}}{\hat{\alpha} + T\hat{\beta}} > 0.

For y > 0, as Tβ̂ becomes large relative to α̂, the quantity Tβ̂/(α̂ + Tβ̂) approaches the value 1, and θ̂ approaches y/T.
In the case of new items being introduced into the inventory system, the estimated expected usage for the item is given by \hat{\theta} = \hat{\beta} = \bar{y}/T, where it is seen from (6) that β is the expected value of θ for the repair part class describing the new item.
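The estimation steps above can be collected into a short computation. The following Python sketch is ours, not the authors' (the function name and the sample data are invented); it follows the moment estimates and the estimator of Eq. (5).

```python
def estimate_usage(ys, T):
    """Method-of-moments estimates of the gamma parameters and the usage
    estimator of this section.  `ys` are the observed total demands for the
    items of a class C over T patrols; requires v > y_bar (the overdispersion
    implied by the negative binomial model)."""
    n = len(ys)
    y_bar = sum(ys) / n                          # sample mean
    v = sum((y - y_bar) ** 2 for y in ys) / n    # sample variance
    beta = y_bar / T                             # since T * beta_hat = y_bar
    alpha = y_bar ** 2 / (v - y_bar)             # alpha_hat = y_bar^2 / (v - y_bar)
    def theta_hat(y):
        # expected usage given observed demand y, Eq. (5); positive even at y = 0
        return beta * (alpha + y) / (alpha + T * beta)
    return alpha, beta, theta_hat

# A small synthetic class: mostly zero demands, a few movers.
ys = [0, 0, 0, 0, 0, 0, 1, 0, 3, 0]
alpha, beta, theta_hat = estimate_usage(ys, T=61)
print(alpha, beta, theta_hat(0), theta_hat(3))
```

For a new item with no usage history, the estimate reduces to β̂ = ȳ/T, as in (6).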
540
S. E. HABER AND R. SITGREAVES
2. EVALUATION OF GOODNESS-OF-FIT
In the preceding section, we assumed that the distribution of expected usage for items in a given
class C was a two-parameter gamma distribution. This, in turn, led to a negative binomial distribution
of demands for items in the class. In this section, we examine the goodness-of-fit of the model just
described. It will be recalled that a similar assessment of the earlier exponential model led to its
rejection. The purpose here is not an exact test of a particular hypothesis, but rather to determine the
reasonableness of the model finally adopted. An additional test of the model in an inventory context is
provided in the next section.
In examining the goodness-of-fit of the model, a large number of repair part classes were defined on the basis of nomenclature, and for each class α̂ and β̂ were computed from the available data.
Having obtained these estimates, theoretical negative binomial distributions of demands for items in
each class were calculated and compared with the actual distributions of y values. The comparison of
the actual and theoretical frequencies for each class was made by computing the value of chi-square as
an index of goodness-of-fit. Again, it was not the purpose to use each of the chi-squares as a rigorous
test of the corresponding null hypothesis. The intent was to utilize the chi-squares and the associated
significance probabilities as the basis for assessing the appropriateness of the model.
In evaluating the results, the following points should be kept in mind. First, because of the vagaries
of reporting, no model may provide a satisfactory fit to the data. For example, extremely large y values
may be expected as a result of mispunched data or stockpiling of material. Additionally, demands for
repair parts are often for even numbered quantities. The prevalence of demands for even quantities
may be seen from the distribution of y values for 61 patrols shown in Table 2.
Table 2. Distribution of 25,138 Different Repair Parts by the Total Quantity of Units Demanded During 61 Patrols

    Total demand    No. of different     Total demand    No. of different
    quantity^a      repair parts         quantity^a      repair parts
    ------------    ----------------     ------------    ----------------
          3               249                  14                26
          4               249                  15                28
          5               121                  16                27
          6               124                  17                 9
          7                86                  18                20
          8                97                  19                14
          9                58                  20                38
         10                61                  21                17
         11                36                  22                13
         12                57                  23                10
         13                29                  24                18

^a For the total sample of 25,138 different repair parts, the numbers of items with a total demand quantity of 0, 1, and 2 units during 61 patrols were 21,597, 1,027, and 495, respectively. The number of different repair parts with a total demand quantity of 25 or more was 632.
Second, the nature of the chi-square statistic itself is such that relatively small differences between observed and expected relative frequencies will lead to large chi-squares if the sample is large. For the purpose of evaluating goodness-of-fit within a demand prediction context, this relation is an important one.
In performing the goodness-of-fit computations, we were able to determine the significance of chi-square for 54 classes of repair parts containing 18,847 different parts. For the repair classes examined, no correction was made for the phenomenon of even-quantity demands. It was possible, however, to correct for the presence of outliers. Items in a repair part class were treated as outliers and eliminated if the smallest y value omitted was large relative to the largest y value included. In almost
all cases the outliers had a very low unit price, or a high total installed population, or large individual
demand quantities, or a combination of these characteristics. For example, in the repair part class
"filters" containing 370 different filters, one filter had a total demand quantity of 320 units — all units
of the item being demanded in a single transaction. Of the 18,847 parts in the sample, this item and
37 other repair parts were eliminated as outliers.*
The results of the goodness-of-fit computations, after elimination of the 38 items considered to
be outliers, are shown in Table 3.
TABLE 3. Summary of Chi-Square Computations

    Different repair     Number of              Number of classes with poor fit at
    parts in class       repair part classes      0.05 level        0.01 level
    ----------------     -------------------      ----------        ----------
    100 or less                  10                    1                 0
    101 to 499                   30                    7                 1
    500 and over                 14                    6                 3
    ----------------     -------------------      ----------        ----------
    Total                        54                   14                 4
Over all classes, poor fits were obtained for only 4 and 14 of the 54 repair part classes at the 0.01 and 0.05 levels, respectively. As may be seen from Table 3, the incidence of poor fits increased as the number of repair parts in a class increased. In interpreting these results, recall the earlier observation that when the number of items in a class is large, a large chi-square may arise even though the discrepancies between observed and expected relative frequencies remain small. Indeed, this was the case for almost all of the repair part classes where the chi-square was larger than expected on the basis of chance alone.
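The fitting computation described above can be sketched as follows. The negative binomial probabilities use the parametrization with mean Tβ and shape α derived in section 1; the function names and the class counts below are our invented illustration, not the authors' data.

```python
import math

def nb_pmf(y, alpha, t_beta):
    """Negative binomial probability of total demand y (the marginal of Eq. (4)):
    p(y) = [Gamma(alpha+y) / (Gamma(alpha) y!)] alpha^alpha (T beta)^y
           / (alpha + T beta)^(alpha+y), computed in logs for stability."""
    log_p = (math.lgamma(alpha + y) - math.lgamma(alpha) - math.lgamma(y + 1)
             + alpha * math.log(alpha) + y * math.log(t_beta)
             - (alpha + y) * math.log(alpha + t_beta))
    return math.exp(log_p)

def chi_square(observed, alpha, t_beta):
    """Chi-square index of fit: observed[y] items had total demand y, with the
    final cell holding all demands >= len(observed) - 1 (the lumped tail)."""
    n = sum(observed)
    chi2, tail = 0.0, 1.0
    for y, o in enumerate(observed[:-1]):
        p = nb_pmf(y, alpha, t_beta)
        tail -= p
        chi2 += (o - n * p) ** 2 / (n * p)
    chi2 += (observed[-1] - n * tail) ** 2 / (n * tail)
    return chi2

# An invented class of 1,000 items: counts for total demand 0, 1, 2, and >= 3.
observed = [900, 50, 25, 25]
print(chi_square(observed, alpha=0.05, t_beta=0.4))
```

The resulting chi-squares would then be referred to their significance probabilities, as in the text, rather than used as rigorous tests.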
3. FURTHER ASSESSMENT OF THE MODEL
In addition to examining the goodness-of-fit of the model, shipboard allowance lists were com-
puted using as input the demand prediction model previously described. These lists were then com-
pared with an allowance list utilizing technicians' usage estimates, both in terms of dollar investment
in stock and shortage counts. The purpose of this evaluation was (1) to simulate the performance of
the model in the environment for which it was designed, and (2) to determine whether differentiating
repair parts by nomenclature class represented an improvement over a simpler approach of grouping
all items into a single class.
The data base for an initial test consisted of 61 patrols of usage history. The items included in this
initial test fall into the first category of repair parts distinguished in this paper, i.e., items for which
usage data are available including data for items with "usage" of zero units. Employing past usage
*The total of 38 outliers was concentrated in 16 repair part classes. No class with 100 or fewer different repair parts con-
tained any outliers; 10 of the 30 classes with 101 to 499 parts contained outliers; while the remaining 6 classes with outliers
came from the 14 classes with 500 or more parts. This distribution is not inconsistent with the plausible hypothesis that the
probability of observing an outlier in a given class increases with the number of different parts in the class.
data for the 61 patrols and the demand prediction model, usage rates were computed for each repair part under two procedures: (1) different α̂ and β̂ were computed for each nomenclature repair part class (Model II A), and (2) a single value of α̂ and β̂ was used for all repair parts regardless of nomenclature class (Model II B).* Allowance list quantities were then computed for these procedures and for the one incorporating technicians' usage estimates (Model I), using the inventory model described in [4]. In
all cases the inventory model was used with the same parameters. Thus, the only difference in the com-
putation of the allowance lists was the technique used for deriving usage estimates. The allowance list
quantities were next compared against usage data during a subsequent 21 patrols. The data for these
patrols were not used in the initial calculation of usage rates. After each new patrol the model allow-
ance list quantities were updated. No updating procedure was available for the quantities computed
using the technicians' estimates.
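The patrol-by-patrol evaluation just described has the shape of a simple loop. The sketch below is our schematic stand-in: the data structures, the `run_patrols` name, and the update callback are invented, and the actual allowance computation of [4] is not reproduced.

```python
def run_patrols(initial_allowance, usage_by_patrol, update):
    """Schematic evaluation loop: compare allowance quantities with actual
    usage for each patrol, count shortages, then update the allowance from
    the new usage data (identity update for the technicians' Model I, which
    had no updating procedure; reestimation for Models II A / II B).
    initial_allowance : dict item -> units stocked
    usage_by_patrol   : list of dicts item -> units demanded that patrol
    update            : function(allowance, patrol_usage) -> new allowance"""
    allowance = dict(initial_allowance)
    shortages = []
    for usage in usage_by_patrol:
        # range = different items short, depth = total units short
        range_short = sum(1 for item, d in usage.items()
                          if d > allowance.get(item, 0))
        depth_short = sum(max(0, d - allowance.get(item, 0))
                          for item, d in usage.items())
        shortages.append((range_short, depth_short))
        allowance = update(allowance, usage)  # applied after each patrol
    return shortages

# Two patrols with no updating, as for Model I.
allow = {"filter": 2, "gasket": 0}
patrols = [{"filter": 3, "gasket": 1}, {"filter": 1, "gasket": 0}]
print(run_patrols(allow, patrols, lambda a, u: a))
```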
Summary data describing the allowance list computed for items with previous usage history are
shown in Table 4. As may be seen, Model I was about three times as expensive as the other two models.
In terms of depth or number of units stocked, Model I stocked almost five times as many units as the
other models. In terms of range or number of different items stocked, both Models I and II A stocked
more items than Model II B. Thus, one effect of distinguishing among repair part classes on the basis
of nomenclature was to increase the range of repair parts stocked by the model.
Table 4. Range, Depth, and Dollar Value of Investment: Items With Previous Usage History^a

    Model     Range of items     Depth of units     Dollar
              stocked^b          stocked^c          value
    -----     --------------     --------------     -------
    I              18.6              112.3          2,703.1
    II A           18.9               25.4            960.4
    II B           16.0               22.5            854.7

^a Averages for 21 patrols. All figures in thousands.
^b Number of different repair parts stocked.
^c Number of units stocked.
The average range or number of different items with a shortage and the average depth or number
of units short per patrol are shown in the first and second column, respectively, of Table 5. It should
be remarked that the latter measure is not without difficulty of interpretation due to the problem of
mix of different units of measure among items, e.g., some items are measured in feet while others are
in units of "each." For the sake of completeness, however, this measure is included as an alternative
measure of performance.
In Table 5, shortage counts are provided separately for items not stocked and for items stocked.
These two categories of stock are distinguished since items in the former category tend to be "not
carried" over successive patrols. From Table 5, it is seen that for items stocked, there were on the
average 19.5 and 23.1 different items with a shortage per patrol for Models I and II A, respectively.
Over all items with a shortage, the total number of units short averaged 172.3 and 225.0. In terms of
the number of units short per item short, Model I averaged 8.8 (172.3 ÷ 19.5) as compared to 9.7 (225.0 ÷ 23.1) for Model II A.
*For Model II B, α̂ = 0.00787 and β̂ = 0.02414.
Table 5. Shortage Counts: Items With Previous Usage History^a

    Items               Shortages: All items
                        Range^b              Depth^c
    Not stocked:
      Model I            2.5  (2.2)            6.8  (13.3)
      Model II A         3.0  (2.8)            4.8   (5.1)
      Model II B         6.7  (4.0)           12.1   (8.8)
    Stocked:
      Model I           19.5 (12.8)          172.3 (132.2)
      Model II A        23.1 (14.3)          225.0 (196.3)
      Model II B        21.2 (13.8)          219.6 (195.0)

^a Averages for 21 patrols. Standard deviation in parentheses.
^b Number of different repair parts with shortages.
^c Number of units short over all repair parts.
A second test, similar to the one described above, was performed for 4,094 items which were treated as new items being introduced into the system. It should be noted that none of these items were included in the previous test. Following the model, in developing usage rates for this test, only the parameter β was used. For Model II A, β̂ varied from class to class; for Model II B, β̂ was invariant for all items. In each case, the β̂ value used was the same β̂ value employed in the first test. Thus this second test was a more stringent one, in that not only were inventory quantities matched against unknown future usage (for 35 patrols), but in estimating item demand distributions the input data were from a completely different set of repair parts.
Summary figures describing the allowance lists and shortage counts for items which were treated
as new items being introduced into the system are shown in Tables 6 and 7, respectively. The format of
these tables is the same as for Tables 4 and 5.
Table 6. Range, Depth, and Dollar Value of Investment: Items With No Previous Usage History^a

    Model     Range of items     Depth of units     Dollar
              stocked^b          stocked^c          value
    -----     --------------     --------------     ------
    I               3.9               18.9           450.2
    II A            3.7                4.4           231.8
    II B            3.2                3.8           145.2

^a Averages for 35 patrols. All figures in thousands.
^b Number of different repair parts stocked.
^c Number of units stocked.
From Table 6, one notes that as in the case for items with previous usage history, Model I was the
most expensive one. The additional dollar value of investment for Model I was once again accounted
for by the large number of units stocked, given that an item was stocked. Likewise, the range of dif-
ferent items stocked was least for Model II B.
An examination of Tables 5 and 7 indicates that for items not stocked, Models I and II A performed
about the same; Model II B performed less well than the other models because of its reduced range of
items stocked. For stocked items, however, Model I performed better than the other models; the per-
formance of Models II A and II B was very similar. Thus, on the basis of the shortage measures alone,
Model I was ranked higher than Model II A because it had fewer shortages for stocked items. Model
II A was ranked higher than Model II B because it had fewer shortages for nonstocked items.
Table 7. Shortage Counts: Items With No Previous Usage History^a

    Items               Shortages: All items
                        Range^b             Depth^c
    Not stocked:
      Model I            1.0  (1.2)          1.7  (2.1)
      Model II A         0.9  (1.4)          1.5  (2.0)
      Model II B         6.0  (6.4)         10.1 (10.0)
    Stocked:
      Model I            3.6  (3.5)         33.8 (53.6)
      Model II A         6.1  (4.3)         48.4 (58.8)
      Model II B         5.3  (4.0)         44.6 (57.7)

^a Averages for 35 patrols. Standard deviation in parentheses.
^b Number of different repair parts with shortages.
^c Number of units short over all repair parts.
One should note that the difference in performance between Models I and II A was small. Model I had 3 to 4 fewer items with a shortage per patrol; given a shortage, the number of units short per item short was at most one less for Model I. On the other hand, the difference in investment cost between the two models was substantial: Model I was approximately 2 to 3 times as expensive as Model II A. The difference in performance between Models II A and II B was about the same as that between Models I and II A. In terms of investment cost, however, Model II B was somewhat less expensive than Model II A.
The finding of small differences in performance between models is reinforced by an examination
of shortage counts for those repair parts which were highly essential.* Shortage counts for this class
of items are found in Table 8. As may be seen, for these items, with the exception of the depth shortage
measure for stocked items with no previous usage history, all models performed about the same.
Table 8. Shortages for Highly Essential Items^a

                        With previous usage history      With no previous usage history
    Items               Range (1)      Depth (2)         Range (3)      Depth (4)
    Not stocked:
      Model I           0    (0)       0    (0)          0.1  (0.2)     0.2  (0.7)
      Model II A        0    (0)       0    (0)          0    (0)       0    (0)
      Model II B        0    (0)       0    (0)          0    (0)       0    (0)
    Stocked:
      Model I           2.2  (3.4)    19.1 (27.4)        0.1  (0.2)     1.7  (7.8)
      Model II A        2.4  (3.7)    23.3 (33.9)        0.4  (0.6)     5.1 (11.2)
      Model II B        2.3  (3.6)    23.2 (33.9)        0.6  (1.2)     5.5 (11.8)

^a Averages for 21 patrols in Cols. 1 and 2; averages for 35 patrols in Cols. 3 and 4; standard deviation in parentheses.
*A discussion of military essentiality coding of repair items is found in [3].
Based on the findings of this section, we conclude that relative to the substantial difference in
investment cost between Model I and Model II A, the difference in performance between these two
models was small. Considering cost as well as performance, Model II A was judged superior to Model I.
Because of the fewer shortage counts for Model II A vis-à-vis Model II B and the similarity of costs between them, Model II A, in which items were distinguished by nomenclature class, was judged superior to Model II B, where all items were lumped into a single class.
4. SUMMARY
In this paper a model is presented which focuses directly on the difficult problem of predicting
demands for items with extremely low usage rates, which form the bulk of repair parts in military
systems. In the model, repair part demands are assumed to be Poisson distributed while their means
are assumed to be gamma distributed. A basic notion underlying the model is the pooling of usage data
for items that have shown some movement for the purpose of estimating usage rates for those items
which have shown no movement.
At the outset, repair parts were partitioned into different classes. An assessment of goodness-of-fit was performed for 54 different classes of items to determine whether the unconditional distribution of demands was indeed negative binomial, as postulated by the model. Given the vagaries of the data, e.g., disproportionately large numbers of even demands and large outliers due probably
to mispunched data and stockpiling of material, the model fit the data quite well. Although the partitioning of repair parts was not essential to the model, it was assumed that such partitioning would yield improved estimates of usage rates. The goodness-of-fit computation and other tests conducted in an inventory context suggest that this indeed was the case.
A unique feature of the model is that in addition to providing positive usage estimates for repair
parts with previous usage history, regardless of whether or not a particular part was observed to move,
the model also makes possible the estimation of usage rates for new items for which no previous usage
history is available. In an inventory context under stringent test conditions, the model performed equally well in both cases when compared with the current procedure for estimating usage rates. Mean shortage counts for the model were slightly higher over all items, and about equal for highly essential items, relative to the mean shortage counts for the current procedure. On the other hand, differences in cost were marked, with the current procedure costing two to three times as much as the proposed model. As indicated by this study, the notion of pooling usage data, and from such data extrapolating usage rates for installed items with zero usage or for items being newly introduced into a system, is a useful one.
ACKNOWLEDGMENT
The authors wish to thank William Hise and Talbot Walls, Jr., for their programming assistance.
REFERENCES
[1] Astrachan, M. and A. S. Cahn (Editors), Proceedings of Rand's Demand Prediction Conference, January 25-26, 1962, The Rand Corporation, RM-3358-PR (1963).
[2] Brown, R. G., Statistical Forecasting for Inventory Control (McGraw-Hill, New York, 1959).
[3] Denicoff, M., J. Fennell, S. E. Haber, W. H. Marlow, F. W. Segel, and H. Solomon, "The Polaris
Military Essentiality System," Nav. Res. Log. Quart. 11, 235-257 (1964).
[4] Denicoff, M., J. Fennell, S. E. Haber, W. H. Marlow, and H. Solomon, "A Polaris Logistics Model,"
Nav. Res. Log. Quart. 11, 259-272 (1964).
[5] Haber, S. E., R. Sitgreaves, and H. Solomon, "A Demand Prediction Technique for Items in Mili-
tary Inventory Systems," Nav. Res. Log. Quart. 16, 302-308 (1969).
A STOCHASTIC CONSTRAINED OPTIMAL REPLACEMENT MODEL*
Peter J. Kalman
Department of Economics
State University of New York
Stony Brook, N.Y.
ABSTRACT
In this paper a stochastically constrained replacement model is formulated. The model determines a sequence of replacement dates such that the total "current account" cost of all future costs and capital expenditures over an infinite time horizon for the n initial incumbent machines is minimized, subject to the constraints that a prescribed expected number of machines be in each chosen utility class at any point in time. We then indicate one possible solution method for the model.
I. INTRODUCTION
The paper is structured as follows. We first define the basic notation in section 2. Next, in section 3, an analytical model is developed which determines a sequence of replacement dates for the ith initial incumbent machine (i = 1, 2, . . ., n) such that the total "current account" cost of all future costs and capital expenditures over an infinite time horizon is minimized subject to the constraint that at any point in time there exists a desired expected number of machines in any chosen "utility" class. We then discuss one possible solution (out of many) under specified assumptions.
II. NOTATION
Suppose there exist n initial incumbent machines. In order to define the model, the following
notation is useful:
u_j^i, the utilization rate of the jth machine in the sequence of replacements for the ith initial incumbent machine, i = 1, . . ., n, j = 1, 2, . . . (from here on, the jth machine in the sequence of replacements for the ith initial incumbent machine is abbreviated MRIM_ij);
L_j^i, the age of MRIM_ij when the decision to replace it is made;
l_j^i, the age of MRIM_ij when it is replaced;
τ, the age of MRIM_ij, 0 ≤ τ ≤ l_j^i;
C_j^i, the utility class of MRIM_ij (C_j^i = 1, 2, . . ., q in decreasing order of desirability. That is, each machine is represented by an ordinal measure of its utility characterized by one of the above integers. Moreover, every machine belongs to one and only one utility class at any point in time.);
R_j^i, a parameter indicating whether MRIM_ij is new or a modernization:
*This work was started while the author was on the professional staff of the Center for Naval Analyses. It was completed while the author was a visiting professor at the Naval Postgraduate School.
    R_j^i = \begin{cases} 1 & \text{if MRIM}_{ij} \text{ is new,} \\ 0 & \text{if MRIM}_{ij} \text{ is a modernization;} \end{cases}

t_j^i, the time when a decision to purchase a replacement for MRIM_ij is made, 0 ≤ t_j^i ≤ ∞;
x_k^i(t), the expected number of machines in the ith sequence in utility class k at time t;
X_k(t), the expected number of machines in utility class k (k = 1, . . ., q) at time t (0 ≤ t ≤ ∞);
p_kj(τ), the probability that a machine which was in class k when it began operating will have moved to class j by the time it is τ years old;
t̂^i, the time the latest replacement machine in the ith sequence began operating;
t̃^i, the starting time of the latest modernization in the ith sequence.
In the model to be presented, the term "replacement" includes "modernization" as well as new machine acquisition. The same applies to "purchase of a replacement." Hence, if MRIM_ij is "modernized," then the modernized version is called "new" and a "new machine" when it enters the process. It follows that "age" represents "duration of time in the process." The symbols L_j^i, C_j^i, and R_j^i represent decision variables, and C_j^i refers to utility class when new.
III. THE MODEL
The model defined below allows a three-way choice among not replacing, modernizing, or building a new machine at any point in time. Furthermore, the model will determine a sequence of replacement purchase times (i.e., {t_j^i}, j = 1, 2, . . ., for each i, i = 1, . . ., n) such that the total "current account" cost of all future costs over an infinite time horizon is minimized subject to the constraints that at each point in time there exists a prescribed expected number of machines in each chosen utility class k (k = 1, 2, . . ., q). It is assumed that there exist n initial incumbent machines.
The operating expense (cost) function for MRIM_ij is

(3.1)    \phi_j^i(u_j^i, t_{j-1}^i, \tau, R_j^i, C_j^i),

where it is understood that the superscript outside a function applies to all the appropriate arguments of that function. Clearly, u_j^i, τ, R_j^i, and C_j^i influence the operating expense of MRIM_ij. The time at which the decision to replace is made, t_{j-1}^i, determines the technological state of advancement of the machine to be installed and hence can also influence φ_j^i.
In addition to operating cost, there is one other major category of cost which must be considered: the investment cost W_j^i of MRIM_ij. This will depend upon whether it is a new machine or a modernization and on the utility class chosen for it. That is,

(3.2)    W_j^i(R_j^i, C_j^i).

Since machines normally have some salvage value at the time of replacement, W_j^i is not the net capital cost of a machine. To obtain this, we subtract the salvage value (which depends on the replacement age) of MRIM_ij. Hence, the net capital cost of MRIM_ij is

(3.3)    W_j^i(R_j^i, C_j^i) - S_j^i(l_j^i)\, e^{-r l_j^i}.

We note that the salvage value may also depend on absolute time, but for simplicity we omit this.
There is one other important element to be considered before one can formulate the "current account" cost: the time required to accomplish a replacement of MRIM_ij of age L_j^i when the replacement action starts at time t_j^i. Before defining this relation, we point out that if a machine is to be replaced by an acquisition, it will remain in service until the new machine arrives, whereas it may be withdrawn immediately if it is to receive a modernization. The time required to accomplish a replacement of MRIM_ij of age L_j^i when the replacement action starts at time t_j^i is represented by

\gamma_j^i(L_j^i, t_j^i).

The L_j^i and the t_j^i are related by

t_j^i = t_{j-1}^i + \gamma_{j-1}^i(L_{j-1}^i, t_{j-1}^i) + L_j^i, \qquad j \ge 2,\ i = 1, 2, \ldots, n,

with t_1^i \ge 0. For future notational convenience, let \gamma_j^i represent \gamma_j^i(L_j^i, t_j^i), j = 0, 1, 2, \ldots, i = 1, \ldots, n.
Define r as the continuous rate of discount. Also, define the current account cost w^i as the outlay stream that has the same present value as all the cost outlays associated with the ith initial incumbent machine in an infinite chain of replacements. That is,

(3.4)    \frac{w^i}{r} = \left[ \int_0^{l_0^i} \phi_0^i(u_0^i, \tau, R_0^i, C_0^i)\, e^{-r\tau}\, d\tau - S_0^i(l_0^i)\, e^{-r l_0^i} \right]

    + e^{-r(t_0^i + \gamma_0^i)} \left[ \int_0^{l^i} \phi_1^i(u_1^i, t_0^i, \tau, R_1^i, C_1^i)\, e^{-r\tau}\, d\tau + W_1^i(R_1^i, C_1^i) - S_1^i(l_1^i)\, e^{-r l_1^i} \right]

    + \cdots + e^{-r(t_m^i + \gamma_m^i)} \left[ \int_0^{l^i} \phi_{m+1}^i(u_{m+1}^i, t_m^i, \tau, R_{m+1}^i, C_{m+1}^i)\, e^{-r\tau}\, d\tau + W_{m+1}^i(R_{m+1}^i, C_{m+1}^i) - S_{m+1}^i(l_{m+1}^i)\, e^{-r l_{m+1}^i} \right] + \cdots,

where i = 1, . . ., n,

l^i = \begin{cases} L_{m+1}^i & \text{if the } (m+1)\text{st machine's replacement in the } i\text{th sequence is a modernization,} \\ L_{m+1}^i + \gamma_{m+1}^i & \text{if a new machine is built,} \end{cases} \qquad m = -1, 0, 1, 2, \ldots,

and \phi_0^i(u_0^i, \tau, R_0^i, C_0^i) is the operating expense function for the ith incumbent machine.
Note that the investment cost of the ith incumbent machine, W_0^i, does not appear in (3.4) since, by hypothesis, the incumbent machine is a sunk investment whose capital cost should not be allowed to influence decisions. The integrals in (3.4) discount the cost stream of each replacement back to the point in time when the machine was new. The exponential term outside each set of brackets then discounts all these cost streams back to the present time. Similarly, the second and third terms inside
each bracket, when discounted, determine the present value of all future investment outlays net of salvage values. Finally, we note that equation (3.4) can be reduced to its discrete analogue in the dynamic programming framework under appropriate assumptions.²
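A numerical sketch of the structure of (3.4): we truncate the infinite chain after a finite number of machines and integrate each operating-cost stream by the trapezoidal rule. The cost functions, the `pv_chain` name, and the sample chain are our illustrative stand-ins, not the paper's.

```python
import math

def pv_chain(machines, r, steps=1000):
    """Present value of a (truncated) replacement chain in the spirit of (3.4).
    Each machine is a tuple (start, life, phi, W, S): it begins operating at
    time `start`, runs `life` years with operating-cost rate phi(tau), costs W
    to install (W = 0 for the sunk incumbent), and returns salvage value S at
    replacement.  Its bracketed cost, discounted to `start`, is
        int_0^life phi(tau) e^{-r tau} dtau + W - S e^{-r life},
    and e^{-r start} discounts that bracket back to the present."""
    pv = 0.0
    for start, life, phi, W, S in machines:
        h = life / steps
        # trapezoidal rule for the operating-cost integral
        integral = sum((0.5 if k in (0, steps) else 1.0)
                       * phi(k * h) * math.exp(-r * k * h)
                       for k in range(steps + 1)) * h
        pv += math.exp(-r * start) * (integral + W - S * math.exp(-r * life))
    return pv

# Illustrative two-machine chain: sunk incumbent for 5 years, one replacement.
chain = [(0.0, 5.0, lambda t: 1.0 + 0.2 * t, 0.0, 2.0),
         (5.0, 10.0, lambda t: 0.5 + 0.1 * t, 8.0, 1.0)]
r = 0.10
pv = pv_chain(chain, r)
w = r * pv  # the constant "current account" outlay stream with the same PV
print(pv, w)
```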
The constraint set will now be formulated, first as an algebraic system and then as a differential equation system. The choice of which formulation to use will depend upon the application of the model. From the definitions of section 2 it follows that

(3.5)    X_j(t) = \sum_{i=1}^{n} \sum_{k=1}^{q} \xi_k^i([t^i])\, p_{kj}(t - [t^i]), \qquad j = 1, 2, \ldots, q,

with the constraint

(3.6)    X_j(t) = \bar{x}_j(t), \qquad j = 1, \ldots, q,

where x̄_j is chosen by the decision maker (we specify a way for calculating the x̄_j's below), and

[t^i] = \max \{\hat{t}^i, \tilde{t}^i\},

\xi_j^i(t) = \begin{cases} 1 & \text{if the machine in use at time } t \text{ in the } i\text{th incumbent machine's sequence is in utility class } j, \\ 0 & \text{if it is not.} \end{cases}
Note that ξ_j^i(t) is a function of current time t. If a machine in the ith sequence is in the process of being modernized at time t, then ξ_j^i(t) = 0 for all j. Consider the inner summation first. ξ_k^i([t^i]) indicates the utility class which we selected for the machine currently in use at time t in the ith sequence when it was new, or tells us that a modernization is currently being undertaken and, consequently, that no machine in the ith sequence is currently in use (i.e., ξ_k^i([t^i]) = 0 for all k). For a sequence i in which a modernization is not being undertaken at time t, ξ_k^i([t^i]) = ξ_k^i(t̂^i) = 1 for one and only one k. If [t^i] = t̂^i, then t − [t^i] gives the age of the machine in use at time t in the ith sequence. If [t^i] = t̃^i, the p_kj(t − [t^i]) term is irrelevant since ξ_k^i([t^i]) = 0 for all k. Thus, the inner sum tells us, for each sequence i in which a machine is in use at time t, the probability of that machine being in utility class j at time t, given the class of the machine when it was built. Then, with the outer summation, we simply sum over all machines in use at time t.
In the above formulation, the constraints are functions of the replacement purchase times ([t^i] is either a replacement purchase time or a time at which a replacement machine begins operation, which is related to a replacement purchase time by the function γ^i) and the utility classes which we choose for the replacement machines, as reflected in the values of ξ_k^i(·). The constraints may have a countable number of discontinuities in an infinite horizon model, since it is likely that there will be a "bump" every time we are able to choose a utility class for a machine (i.e., when we have a modernization or build a new machine).
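The algebraic constraint (3.5) is a direct double sum once the ξ indicators and the transition function p_kj are in hand. A sketch, with a two-class illustrative transition law of our own invention standing in for the empirical probabilities:

```python
import math

Q = 2  # two utility classes in this illustration

def p(k, j, tau):
    # illustrative transition law (our invention): a class-0 ("good") machine
    # decays to class 1 at rate 0.3; class 1 is absorbing
    if k == 0:
        return math.exp(-0.3 * tau) if j == 0 else 1.0 - math.exp(-0.3 * tau)
    return 1.0 if j == 1 else 0.0

def X(t, sequences):
    """Expected class occupancies at time t, Eq. (3.5):
    X_j(t) = sum_i sum_k xi_k^i([t^i]) p_kj(t - [t^i]).
    `sequences` is a list of ([t^i], xi) pairs, where xi[k] = 1 if the machine
    in use in sequence i was placed in class k when new, and xi is all zeros
    while the sequence is undergoing a modernization (no machine in use)."""
    return [sum(xi[k] * p(k, j, t - t_i)
                for (t_i, xi) in sequences
                for k in range(Q))
            for j in range(Q)]

# Three sequences: machines started new in class 0 at times 0 and 2; the third
# sequence is in modernization at the evaluation time (xi all zero).
seqs = [(0.0, [1, 0]), (2.0, [1, 0]), (1.0, [0, 0])]
print(X(4.0, seqs))
```

Note that the entries of X sum to the number of machines currently in use, as the text's interpretation of the double sum requires.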
An alternative formulation of the constraint set will be given via a system of differential equations.
This formulation involves the solution to a differential equation system while the above formulation
does not.
For each sequence i (i = 1, . . ., n) we have the following matrix differential equation:

(3.7)    \frac{dX^i}{dt} = P(t - [t^i])\, X^i,
² See [1].
where P is a q × q matrix of the transition probabilities p_kj(t − [t^i]) and X^i is a q × 1 matrix of the x_j^i(t)'s, subject to the initial condition

(3.8)    X^i([t^i]) = X_0^i,

where X_0^i is a q × 1 matrix whose jth element is x_j^i([t^i]), with

x_j^i([t^i]) = \begin{cases} 1 & \text{if } [t^i] = \hat{t}^i \text{ and we selected utility class } j \text{ for this machine when it was built,} \\ 0 & \text{otherwise.} \end{cases}
That is, if a modernization is going on at time t in the ith sequence, we do not have to be concerned about any transition probabilities between classes because no machine is in use. Otherwise, we look back to see in what class we put the machine in use at time t in the ith sequence when it was new. In system (3.7) the jth equation,

\frac{dx_j^i}{dt} = \sum_{k \ne j} p_{kj}(t - [t^i])\, x_k^i(t) + \bigl(p_{jj}(t - [t^i]) - 1\bigr)\, x_j^i(t), \qquad j = 1, \ldots, q,

states that the net rate of change of the number of machines in the ith sequence in class j over an infinitesimal time period equals the number of machines which enter class j in the ith sequence at time t minus the number of machines in the ith sequence which leave class j for other classes.
If we assume that P is an analytic coefficient matrix, then our finite system (3.7) of first-order linear differential equations with analytic coefficient functions, subject to the initial value (3.8), admits a unique analytic solution, which can be found explicitly by equating coefficients in the relevant power series.³ Clearly, there are an infinite number of functions in this class which could be used to represent the transition probabilities as functions of time. Linear or exponential functions are just two elements of this class which we can use.
One should note that we have a set of q differential equations for each sequence i. Let us denote:

TNDM_i, the time of the next decision to modernize a machine in the ith sequence;
TNRO_i, the time the next replacement machine begins operation in the ith sequence.

For fixed i, we may have a different set of initial conditions for each interval of the form [t^i] ≤ t < min {TNDM_i, TNRO_i}, reflecting the series of decisions we can make concerning the class of the replacement machines. Thus, for each i, we would have at most q non-trivial sets of differential equations to solve, corresponding to the q different non-trivial initial conditions we may have. Of course, a solution corresponding to a given set of initial conditions may apply to intervals of the form [t^i] ≤ t < min {TNDM_i, TNRO_i} of different lengths.

Thus, the x_j^i(t) will be functions of the replacement purchase times and the utility class decisions.
³ See [2], pp. 89-92.
Then X_j(t) = \sum_{i=1}^{n} x_j^i(t). However, we could only hope for continuity of X_j(t) for

\max_i\, [t^i] \le t < \min_i\, \{\mathrm{TNDM}_i, \mathrm{TNRO}_i\}.

That is to say, X_j(t) would be continuous only for t \in \bigcap_i \bigl([t^i], \min_i \{\mathrm{TNDM}_i, \mathrm{TNRO}_i\}\bigr). The same is true of the continuity of X_j(t) in our non-differential-equation approach. For example, suppose we have two machines:
[Timeline diagram of the two machines, with decision times t_1, t_2, t'_2, t_3; the interval (t'_2, t_3) between decisions is shown dotted.]
Specifically, suppose at t_2 a decision is made to build a new machine in sequence 1, while t'_2 involves
a decision to modernize the machine in use in sequence 2. Thus [t^1] = t_1 and [t^2] = t'_2, so that

∩_i ([t^i], min_i {TNDM_i, TNRO_i}) = (t'_2, t_3),
as denoted by the dotted lines. It is within this interval that no decisions regarding the utility class of
the machines are made. Consequently, no "bumps" are experienced. Instead, machines change classes
smoothly according to the transition probabilities.
It should be emphasized that the main factor behind the two different approaches to the constraint
formulation is the definition of the transition probabilities. It seems likely that the first approach is
preferable from a practical standpoint.
If the transition probability matrix is a constant matrix P, it is well known that system (3.7) has the
following unique solution

(3.12)  x_l^i(t) = Σ_{j=1}^{k} Σ_{r=1}^{m_j} c_{jr} t^{r-1} e^{μ_j t} v_{lj},

where μ_1, . . ., μ_k are the eigenvalues of P with multiplicities m_1, . . ., m_k, respectively, and v_{lj} is
the eigenvector associated with μ_j. The solution (3.12) represents the number of machines in class
l (l = 1, . . ., q) in the ith sequence at time t.
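The constant-coefficient case can be illustrated numerically. The following sketch uses an assumed 2-class rate matrix M and an assumed initial stock of 100 machines (the exact form of system (3.7) is not reproduced here); it builds the eigenexpansion solution x_l(t) = Σ_j c_j e^{μ_j t} v_{lj}, the m_j = 1 case of (3.12):

```python
import numpy as np

# Hypothetical 2-class example: x'(t) = M x(t), x(0) = x0.
# When M is diagonalizable, x(t) = sum_j c_j exp(mu_j t) v_j,
# which is the simple-eigenvalue (m_j = 1) case of (3.12).
M = np.array([[-0.3, 0.2],
              [0.3, -0.2]])   # assumed transition-rate matrix (columns sum to 0)
x0 = np.array([100.0, 0.0])   # 100 machines start in class 1

mu, V = np.linalg.eig(M)      # eigenvalues mu_j, eigenvectors as columns of V
c = np.linalg.solve(V, x0)    # coefficients c_j fixed by the initial condition

def x(t):
    # x_l(t) = sum_j c_j exp(mu_j t) v_{lj}: scale column j by exp(mu_j t)
    return (V * np.exp(mu * t)) @ c

# a rate matrix conserves the total machine count
assert abs(x(5.0).sum() - 100.0) < 1e-9
```

Because one eigenvalue of a rate matrix is zero, the solution relaxes to a stationary split of the 100 machines as t grows.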
We now formulate the constrained optimization problem of determining a sequence of replacement
times such that the total current account cost of all future costs and capital expenditures over an
infinite time horizon of the n incumbent machines is minimized subject to the chosen constraints that
X̄_k(t) machines are in utility worth class k at time t (k = 1, . . ., q). That is, we want to minimize
Σ_{i=1}^{n} w^i, where w^i is defined by (3.4), subject to the constraints X_j(t) = X̄_j(t), j = 1, . . ., q, which are
defined by (3.5). One possible way to proceed is to assume that the constraints are exactly satisfied
for all values of t and make a discrete approximation of the objective function. Then we can apply
dynamic programming techniques in order to solve the problem.
As an exercise we have solved the model under the following assumptions:
(1) a constant level of utility over time;
(2) a linear operating cost function;
(3) a constant rate of machine utilization;
(4) instantaneous replacement;
(5) equal economic lives of all replacements after the first;
(6) equal investment costs of all machines.
IV. ACKNOWLEDGMENT
I would like to thank G. E. Bowden, L. Ravenscroft, and D. Richter, members of the professional
staff of the Center for Naval Analyses, for their helpful comments.
REFERENCES
[1] Dreyfus, S., Dynamic Programming and the Calculus of Variations (Academic Press, New York, 1965).
[2] Hochstadt, H., Differential Equations: A Modern Approach (Holt, Rinehart and Winston, New York, 1964).
[3] Hotelling, H., "A General Mathematical Theory of Depreciation," Journal of the American Statistical
Association, Vol. XX, p. 340, September 1925.
[4] Jorgenson, D. W., J. J. McCall, and R. Radner, Optimal Replacement Policy (Rand McNally, Chicago, 1967).
A NOTE ON THE CALCULATION OF EXPECTED TIME-WEIGHTED
BACKORDERS OVER A GIVEN INTERVAL
Chung-Mei Ho
AMC Inventory Research Office
ABSTRACT
Two formulae are presented for calculating expected time-weighted backorders over a
fixed time interval. One formula, obtained by a direct intuitive approach, is a more precise
form of a result found in the literature. The second is derived from the steady-state
distribution of inventory and is directly compatible with the use of steady-state (R, Q)
models.
The two formulae are compared and reconciled.
1. INTRODUCTION
A rather old problem is considered. An inventory system begins with N units of stock, all on hand.
Inventory is depleted by demands which arrive in a random pattern. All demands occurring when the
system is out of stock are backordered. What is the expected number of time-weighted backorders
over a period of length L, assuming that no stock arrives during the period L? The time-weighting is
linear, i.e., a backorder lasting for 2 weeks counts as much as two backorders lasting for 1 week.
Typically, of course, L represents a procurement lead time. Two solution techniques are compared.
One approach is to deduce the answer from a result obtained in the solution of steady-state (R, Q)
inventory models, such as described in Hadley and Whitin [2]. A more direct intuitive approach is
presented in section 2.
It is found that: (a) for Poisson distributed demand both approaches are equivalent, though the
formulae found are very different in appearance; (b) when demand is large and the Normal approxima-
tion to the Poisson is used, the results given by the two formulae are not the same; and (c) both give
reasonable approximations to the true solution, i.e., the solution using Poisson demand. The two formulae
improve as approximations when R is close to or smaller than the mean lead time demand.
The primary significance of this work is in reconciling the two approaches. Moreover, the direct-
approach formula is given in a more precise form than we have come across in the literature before.
For example, see Hanssmann [3, pp. 47-48].
2. DIRECT APPROACH - POISSON DEMAND
We assume a stationary Poisson demand process. Given the initial on-hand inventory is N units,
what is the expected number of time-weighted backorders over the period L?
Let Y be the total demand, u be the expected value of demand, and B_L be the time-weighted
backorders over L. Figure 1 follows to help in the derivation.

Figure 1. [Diagram: on-hand inventory declining over the interval L; the time-weighted backorders are represented by an area B_L.]

From the diagram, the total time-weighted backorders are represented by the area B_L. There
are Y - N orders outstanding just before the replenishment arrives. Numbering the customers back-
wards, let T_i be the waiting time of the (Y - N - i + 1)th customer backordered. Then T_1 is the waiting
time of the last customer. Let the time-weighted backorders given a demand of Y be B_L|Y; then

(2.1)  B_L|Y = Σ_{i=1}^{Y-N} T_i,    E(B_L|Y) = Σ_{i=1}^{Y-N} E(T_i|Y).
For a Poisson process, given the total demand over the period, the arrival time of each customer
is a random sample from a uniform distribution. For a proof of this statement, see, for example, Karlin
[4, chap. 9]. To simplify the derivation and without loss of generality, L is assumed to be 1.
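The uniform-arrival property can be checked by Monte Carlo: with Y arrivals uniform on (0, 1), the waiting time of the customer who is ith from the end is 1 minus the (Y - i + 1)th smallest arrival time, with expectation i/(Y + 1). A sketch with hypothetical Y = 10 and i = 3:

```python
import random
import statistics

# With Y arrival times i.i.d. uniform on (0, 1), the waiting time of the
# customer who is ith from the end is 1 - (the (Y-i+1)th smallest arrival).
rng = random.Random(0)
Y, i, reps = 10, 3, 100_000          # hypothetical values for the check

samples = []
for _ in range(reps):
    arrivals = sorted(rng.random() for _ in range(Y))
    samples.append(1.0 - arrivals[Y - i])   # (Y-i+1)th smallest, 0-indexed

est = statistics.fmean(samples)
# theory: E(T_i | Y) = i/(Y+1) = 3/11
assert abs(est - i / (Y + 1)) < 0.005
```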
The T_i's have a Beta distribution,* and

(2.2)  E(T_i|Y) = i/(Y+1),    i = 1, 2, . . ., Y-N,

so

E(B_L|Y) = Σ_{i=1}^{Y-N} i/(Y+1) = (Y-N)(Y-N+1) / (2(Y+1)),

thus

(2.3)  E(B_L) = (1/2) E[(Y-N)(Y-N+1)/(Y+1)] = (1/2) Σ_{Y=N}^{∞} [(Y-N)(Y-N+1)/(Y+1)] p(Y),

where

p(Y) = e^{-u} u^Y / Y!.
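As a check on (2.3), the following sketch (hypothetical values N = 3, u = 5, L = 1) evaluates the formula via the Poisson recursion and compares it with a direct Monte Carlo simulation of the backorder process:

```python
import math
import random

def direct_formula(N, u, tol=1e-12):
    # Eq. (2.3) with L = 1: E(B_L) = 1/2 * sum_{Y>N} (Y-N)(Y-N+1)/(Y+1) p(Y)
    total, Y, p = 0.0, 0, math.exp(-u)      # p = p(0) = e^{-u}
    while Y <= N or p > tol:
        if Y > N:
            total += 0.5 * (Y - N) * (Y - N + 1) / (Y + 1) * p
        Y += 1
        p *= u / Y                          # Poisson recursion p(Y) = p(Y-1) u/Y
    return total

def simulated(N, u, reps=100_000, seed=7):
    # Monte Carlo: Poisson arrivals on (0, 1); each demand past the Nth
    # is backordered from its arrival until the period ends
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(reps):
        t, arrivals = rng.expovariate(u), []
        while t < 1.0:
            arrivals.append(t)
            t += rng.expovariate(u)
        acc += sum(1.0 - a for a in arrivals[N:])
    return acc / reps

assert abs(direct_formula(3, 5.0) - simulated(3, 5.0)) < 0.05
```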
3. RESULT IMPLICIT IN SOLUTION OF STEADY STATE (R, Q) MODEL
Consider an (R, Q) system with fixed procurement lead time L. If demand is stationary and
Poisson, then using the steady-state approach† the expected total time-weighted backorders per year is

(3.1)  E(SSB_yr) = (1/Q) Σ_{Y=R+1}^{∞} (Y-R-1)[P(Y) - P(Y+Q)],

where

P(Y) = Σ_{k=Y}^{∞} p(k).

*For reference, see Fisz [1, p. 377].
†See, for example, Hadley and Whitin [2, p. 184].
Now as Q gets large, the probability that lead time demand will exceed Q approaches 0. If we
assume the probability is effectively 0, and L is 1, then

(3.2)  E(SSB_yr) = (1/Q) Σ_{Y=R+1}^{∞} (Y-R-1)P(Y)

and

(3.3)  E(SSB_cy) = (Q/u) E(SSB_yr) = (1/u) Σ_{Y=R+1}^{∞} (Y-R-1)P(Y),
where E(SSB C y) is expected time-weighted backorders per cycle. A cycle can be defined as the time
between initiation of successive procurements. Equation (3.3) follows from the fact that if lead time
demand does not exceed Q, there can be at most one procurement order outstanding at a time. If so,
this implies cyclic statistics such as cyclic backorders can be determined from yearly statistics as
shown.*
An implication of the assumption that there is never more than one order outstanding is that all R
assets at the time an order is placed are on hand. Thus, backorders sustained over a cycle are equivalent
to backorders sustained during that part of the cycle which begins with the start of the procurement
lead time. This establishes an identity between backorders in a cycle and the problem posed in Section 2
(with R substituted for N):

(3.4)  E(B_L) = E(B_cy) = (1/2) Σ_{Y=R}^{∞} [(Y-R)(Y-R+1)/(Y+1)] p(Y).
4. RECONCILIATION OF DIRECT APPROACH TO STEADY STATE (R, Q) MODEL
Both Eqs. (3.3) and (3.4) are expected time-weighted backorders over a procurement lead time.
A few steps are needed to show that they are identical.
E(SSB_cy) = (1/u) Σ_{Y=R+1}^{∞} (Y-R-1)P(Y)

= (1/u) Σ_{Y=R+1}^{∞} (Y-R-1) Σ_{k=Y}^{∞} p(k)

= (1/u) Σ_{Y=R+1}^{∞} Σ_{k=Y}^{∞} (Y-R-1)p(k).

Interchanging the order of summation, we get

(1/u) Σ_{k=R+1}^{∞} p(k) Σ_{Y=R+1}^{k} (Y-R-1)

= (1/u) Σ_{k=R+1}^{∞} p(k) Σ_{i=0}^{k-R-1} i

= (1/(2u)) Σ_{k=R+1}^{∞} (k-R-1)(k-R)p(k).

*See, for example, Hadley and Whitin [2, Chap. 4].
Substitute Y = k - 1 to get

(1/(2u)) Σ_{Y=R}^{∞} (Y-R)(Y-R+1)p(Y+1),

then use p(Y+1) = [u/(Y+1)]p(Y) (which is a Poisson property) to obtain

(1/2) Σ_{Y=R}^{∞} [(Y-R)(Y-R+1)/(Y+1)] p(Y),

which is the same as Eq. (3.4).
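The identity just derived can be confirmed numerically. The sketch below (hypothetical R and u values) evaluates Eq. (3.3) via tail probabilities and Eq. (3.4) via the Poisson recursion; for Poisson demand the two agree to rounding error:

```python
import math

def poisson_pmfs(u, terms):
    # p(0), ..., p(terms-1) via the recursion p(n) = p(n-1) * u / n
    p = [math.exp(-u)]
    for n in range(1, terms):
        p.append(p[-1] * u / n)
    return p

def steady_state_cycle(R, u, terms=200):
    # Eq. (3.3): E(SSB_cy) = (1/u) sum_{Y=R+1} (Y-R-1) P(Y),
    # with P(Y) = sum_{k>=Y} p(k) the upper-tail probability
    p = poisson_pmfs(u, terms)
    tails = [0.0] * (terms + 1)
    for Y in range(terms - 1, -1, -1):
        tails[Y] = tails[Y + 1] + p[Y]
    return sum((Y - R - 1) * tails[Y] for Y in range(R + 1, terms)) / u

def direct_cycle(R, u, terms=200):
    # Eq. (3.4): E(B_cy) = (1/2) sum_{Y=R} (Y-R)(Y-R+1)/(Y+1) p(Y)
    p = poisson_pmfs(u, terms)
    return 0.5 * sum((Y - R) * (Y - R + 1) / (Y + 1) * p[Y]
                     for Y in range(R, terms))

# Section 4's algebra: the two expressions coincide for Poisson demand
for R, u in [(0, 2.0), (3, 5.0), (10, 8.0)]:
    assert abs(steady_state_cycle(R, u) - direct_cycle(R, u)) < 1e-9
```

For R = 0 the check also has a closed form: (3.4) collapses to u/2, a convenient sanity value.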
5. NORMAL APPROXIMATION TO POISSON DEMAND
As total lead time demand increases, the Poisson distribution can be approximated by the Normal.
Using the Normal approximation to Eq. (3.4) we have

(5.1)  E(B_cy) = (1/2) ∫_R^∞ [(Y-R)(Y-R+1)/(Y+1)] f(Y) dY,

where

f(Y) = (1/(σ√(2π))) exp(-(1/2)((Y-u)/σ)²),

u is the expected lead time demand and σ equals √u. Using the Normal approximation to Eq. (3.3),
we have

(5.2)  E(SSB_cy) = (1/u) ∫_R^∞ (Y-R)G(Y) dY,  where  G(Y) = ∫_Y^∞ f(t) dt.

Let w = G(Y) and dv = (Y-R)dY, and use integration by parts in (5.2). This gives

E(SSB_cy) = (1/(2u)) ∫_R^∞ (Y-R)² f(Y) dY,

which is different from Eq. (5.1).
The discrepancy is due to the use of the Normal approximation. In equating Eqs. (3.3) and (3.4), the
key relation

p(Y+1) = [u/(Y+1)] p(Y)

is used, which holds only for the Poisson distribution and not for the Normal distribution. The two formulae
are essentially the same, and empirically were found to be almost equally effective as approximations
to the solution obtained using the Poisson distribution of demand.
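The size of the discrepancy can be seen numerically. The sketch below (hypothetical R = u = 25) evaluates Eq. (5.1) and the integrated-by-parts form of Eq. (5.2) with the midpoint rule; the two Normal-approximation formulae differ, but only slightly:

```python
import math

def normal_pdf(y, u, s):
    return math.exp(-0.5 * ((y - u) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def eq_5_1(R, u, h=0.005):
    # Normal approximation to (3.4): (1/2) int_R^inf (Y-R)(Y-R+1)/(Y+1) f(Y) dY
    s, acc = math.sqrt(u), 0.0
    top = u + 10.0 * s                      # truncate far in the upper tail
    for i in range(int((top - R) / h)):     # midpoint rule
        y = R + (i + 0.5) * h
        acc += (y - R) * (y - R + 1) / (y + 1) * normal_pdf(y, u, s)
    return 0.5 * acc * h

def eq_5_2(R, u, h=0.005):
    # (5.2) after integration by parts: (1/(2u)) int_R^inf (Y-R)^2 f(Y) dY
    s, acc = math.sqrt(u), 0.0
    top = u + 10.0 * s
    for i in range(int((top - R) / h)):
        y = R + (i + 0.5) * h
        acc += (y - R) ** 2 * normal_pdf(y, u, s)
    return acc * h / (2.0 * u)

a, b = eq_5_1(25, 25.0), eq_5_2(25, 25.0)
assert abs(a - b) > 1e-3    # the two Normal forms genuinely differ ...
assert abs(a - b) < 0.1     # ... but only slightly
```

For R = u, Eq. (5.2) reduces analytically to σ²/(4u) = 1/4, which the numerical value reproduces.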
ACKNOWLEDGMENTS
I am indebted to Mr. Alan Kaplan and Mr. W. Karl Kruse for their contributions to this work.
Appreciation is also extended to Professor Edward Silver for his helpful comments.
REFERENCES
[1] Fisz, M., Probability Theory and Mathematical Statistics (John Wiley & Sons, Inc., New York
and London, 1963).
[2] Hadley, G. and Whitin, T. M., Analysis of Inventory Systems (Prentice-Hall, Inc., Englewood
Cliffs, N. J., 1963).
[3] Hanssmann, F., Operations Research in Production and Inventory Control (John Wiley & Sons,
Inc., New York and London, 1962).
[4] Karlin, S., A First Course in Stochastic Processes (Academic Press, New York and London, 1966).
INDEX OF VOLUME 17
AHSANULLAH, M. and A. K. MD. E. SALEH, "Optimum Allocation of Quantiles in Disjoint Intervals
for the Blues of the Parameters of Exponential Distribution when the Sample is Censored in the
Middle," Vol. 17, No. 3, Sept. 1970, pp. 331-349.
ALTER, R. and B. LIENTZ, "A Note on a Problem of Smirnov: A Graph Theoretic Interpretation,"
Vol. 17, No. 3, Sept. 1970, pp. 407-408.
BALAS, E., "Machine Sequencing: Disjunctive Graphs and Degree-Constrained Subgraphs," Vol. 17,
No. 1, Mar. 1970, pp. 1-10.
BEGED-DOV, A. G., "Contract Award Analysis by Mathematical Programming," Vol. 17, No. 3, Sept.
1970, pp. 297-307.
BELL, C. E., "Multiple Dispatches in a Poisson Process," Vol. 17, No. 1, Mar. 1970, pp. 99-102.
BELLMORE, M., G. BENNINGTON and S. LUBORE, "A Network Isolation Algorithm," Vol. 17,
No. 4, Dec. 1970, pp. 461-469.
BENNINGTON, G., M. BELLMORE, and S. LUBORE, "A Network Isolation Algorithm," Vol. 17,
No. 4, Dec. 1970, pp. 461-469.
BENNINGTON, G., and S. LUBORE, "Resource Allocation for Transportation," Vol. 17, No. 4, Dec.
1970, pp. 471-484.
BHASHYAM, N., "Stochastic Duels with Lethal Dose," Vol. 17, No. 3, Sept. 1970, pp. 397-405.
BHASHYAM, N., "Stochastic Duels with Nonrepairable Weapons," Vol. 17, No. 1, Mar. 1970, pp.
121-129.
BLUMENTHAL, S., "Interval Estimation of the Normal Mean Subject to Restrictions, When the Vari-
ance Is Known," Vol. 17, No. 4, Dec. 1970, pp. 485-505.
BOWMAN, V. J., JR. and G. L. NEMHAUSER, "A Finiteness Proof for Modified Dantzig Cuts in
Integer Programming," Vol. 17, No. 3, Sept. 1970, pp. 309-313.
BRANDT, E. B. and D. R. LIMAYE, "MAD: Mathematical Analysis of Downtime," Vol. 17, No. 4,
Dec. 1970, pp. 525-534.
BROWN, R. G. and G. GERSON, "Decision Rules for Equal Shortage Policies," Vol. 17, No. 3, Sept.
1970, pp. 351-358.
BURT, J. M., JR., D. P. GAVER, and M. PERLAS, "Simple Stochastic Networks: Some Problems and
Procedures," Vol. 17, No. 4, Dec. 1970, pp. 439-459.
COZZOLINO, J. M., "The Optimal Burn-In Testing of Repairable Equipment," Vol. 17, No. 2, June
1970, pp. 167-181.
CREMEANS, J. E., R. A. SMITH, and G. R. TYNDALL, "Optimal Multicommodity Network Flows
with Resource Allocation," Vol. 17, No. 3, Sept. 1970, pp. 269-279.
DAY, J. E. and M. P. HOTTENSTEIN, "Review of Sequencing Research," Vol. 17, No. 1, Mar. 1970,
pp. 11-39.
DISNEY, R. L. and W. E. MITCHELL, "A Solution for Queues with Instantaneous Jockeying and Other
Customer Selection Rules," Vol. 17, No. 3, Sept. 1970, pp. 315-325.
DUDEWICZ, E. J., "Confidence Intervals for Ranked Means," Vol. 17, No. 1, Mar. 1970, pp. 69-78.
EVANS, J. P., "On Constraint Qualifications in Nonlinear, Programming," Vol. 17, No. 3, Sept. 1970,
pp. 281-286.
GAVER, D. P., J. M. BURT, JR., and M. PERLAS, "Simple Stochastic Networks: Some Problems and
Procedures," Vol. 17, No. 4, Dec. 1970, pp. 439-459.
GERSON, G. and R. G. BROWN, "Decision Rules for Equal Shortage Policies," Vol. 17, No. 3, Sept.
1970, pp. 351-358.
GOODMAN, I. F., "Statistical Quality Control of Information," Vol. 17, No. 3, Sept. 1970, pp. 389-396.
HABER, S. E. and R. SITGREAVES, "A Methodology for Estimating Expected Usage of Repair Parts
with Application to Parts with No Usage History," Vol. 17, No. 4, Dec. 1970, pp. 535-546.
HARRIS, M. Y., "A Mutual Primal-Dual Linear Programming Algorithm," Vol. 17, No. 2, June 1970,
pp. 199-206.
HARTMAN, J. K. and L. S. LASDON, "A Generalized Upper Bounding Method for Doubly Coupled
Linear Programs," Vol. 17, No. 4, Dec. 1970, pp. 411-429.
HITCHCOCK, D. F. and J. B. MACQUEEN, "On Computing the Expected Discounted Return in a
Markov Chain," Vol. 17, No. 2, June 1970, pp. 237-241.
HO, C., "A Note on the Calculation of Expected Time-Weighted Backorders Over A Given Interval,"
Vol. 17, No. 4, Dec. 1970, pp. 555-559.
HOTTENSTEIN, M. P. and J. E. DAY, "Review of Sequencing Research," Vol. 17, No. 1, Mar. 1970,
pp. 11-39.
JACQUETTE, S. C., "Suboptimal Ordering Policies Under the Full Cost Criterion," Vol. 17, No. 1,
Mar. 1970, pp. 131-132.
KALMAN, P. J., "A Stochastic Constrained Optimal Replacement Model," Vol. 17, No. 4, Dec. 1970,
pp. 547-553.
KAPLAN, A. J., "The Relationship Between Decision Variables and Penalty Cost Parameter in (Q, R)
Inventory Models," Vol. 17, No. 2, June 1970, pp. 253-258.
LASDON, L. S. and J. K. HARTMAN, "A Generalized Upper Bounding Method for Doubly Coupled
Linear Programs," Vol. 17, No. 4, Dec. 1970, pp. 411-429.
LIENTZ, B. and R. ALTER, "A Note on a Problem of Smirnov: A Graph Theoretic Interpretation,"
Vol. 17, No. 3, Sept. 1970, pp. 407-408.
LIMAYE, D. R. and E. B. BRANDT, "MAD: Mathematical Analysis of Downtime," Vol. 17, No. 4,
Dec. 1970, pp. 525-534.
LUBORE, S., M. BELLMORE and G. BENNINGTON, "A Network Isolation Algorithm," Vol. 17,
No. 4, Dec. 1970, pp. 461-469.
LUBORE, S. and G. BENNINGTON, "Resource Allocation for Transportation," Vol. 17, No. 4, Dec.
1970, pp. 471-484.
MCMASTERS, A. W. and T. M. MUSTIN, "Optimal Interdiction of a Supply Network," Vol. 17, No.
3, Sept. 1970, pp. 261-268.
MACQUEEN, J. B. and D. F. HITCHCOCK, "On Computing the Expected Discounted Return in a
Markov Chain," Vol. 17, No. 2, June 1970, pp. 237-241.
MALIK, H. J., "The Distribution of the Product of Two Non-Central Beta Variates," Vol. 17, No. 3,
Sept. 1970, pp. 327-330.
MANN, N. R., "Computer-Aided Selection of Prior Distributions for Generating Monte Carlo Con-
fidence Bounds on System Reliability," Vol. 17, No. 1, Mar. 1970, pp. 41-54.
MARKLAND, R. E., "A Comparative Study of Demand Forecasting Techniques for Military Helicopter
Spare Parts," Vol. 17, No. 1, Mar. 1970, pp. 103-119.
MAZUMDAR, M., "Some Estimates of Reliability Using Interference Theory," Vol. 17, No. 2, June
1970, pp. 159-165.
MITCHELL, W. E. and R. L. DISNEY, "A Solution for Queues with Instantaneous Jockeying and
Other Customer Selection Rules," Vol. 17, No. 3, Sept. 1970, pp. 315-325.
MOGLEWER, S. and C. PAYNE, "A Game Theory Approach to Logistics Allocation," Vol. 17, No. 1,
Mar. 1970, pp. 87-97.
MOREY, R. C., "Inventory Systems with Imperfect Demand Information," Vol. 17, No. 3, Sept. 1970,
pp. 287-295.
MUSTIN, T. M. and A. W. MCMASTERS, "Optimal Interdiction of A Supply Network," Vol. 17, No.
3, Sept. 1970, pp. 261-268.
NEMHAUSER, G. L. and V. J. BOWMAN, JR., "A Finiteness Proof for Modified Dantzig Cuts in
Integer Programming," Vol. 17, No. 3, Sept. 1970, pp. 309-313.
NIGHTENGALE, M. E., "The Value Statement," Vol. 17, No. 4, Dec. 1970, pp. 507-514.
NOLAN, R. L., "Systems Analysis and Planning-Programming-Budgeting Systems (PPBS) for Defense
Decision Making," Vol. 17, No. 3, Sept. 1970, pp. 359-372.
PAYNE, C. and S. MOGLEWER, "A Game Theory Approach to Logistics Allocation," Vol. 17, No. 1,
Mar. 1970, pp. 87-97.
PERLAS, M., J. M. BURT, JR., and D. P. GAVER, "Simple Stochastic Networks: Some Problems
and Procedures," Vol. 17, No. 4, Dec. 1970, pp. 439-459.
PRESUTTI, V. J., JR., and R. C. TREPP, "More Ado About Economic Order Quantities (EOQ)," Vol.
17, No. 2, June 1970, pp. 243-251.
ROLFE, A. J., "Markov Chain Analysis of a Situation Where Cannibalization is the Only Repair
Activity," Vol. 17, No. 2, June 1970, pp. 151-158.
SALEH, A. K. MD. E. and M. AHSANULLAH, "Optimum Allocation of Quantiles in Disjoint Intervals
for the Blues of the Parameters of Exponential Distribution when the Sample is Censored in the
Middle," Vol. 17, No. 3, Sept. 1970, pp. 331-349.
SCHAFER, R. E. and N. D. SINGPURWALLA, "A Sequential Bayes Procedure for Reliability Demon-
stration," Vol. 17, No. 1, Mar. 1970, pp. 55-67.
SCHRADY, D. A., "Operational Definitions of Inventory Record Accuracy," Vol. 17, No. 1, Mar.
1970, pp. 133-142.
SCOTT, M., "A Queueing Process with Varying Degree of Service," Vol. 17, No. 4, Dec. 1970, pp.
515-523.
SINGPURWALLA, N. D. and R. E. SCHAFER, "A Sequential Bayes Procedure for Reliability Demon-
stration," Vol. 17, No. 1, Mar. 1970, pp. 55-67.
SITGREAVES, R. and S. E. HABER, "A Methodology for Estimating Expected Usage of Repair
Parts with Application to Parts with No Usage History," Vol. 17, No. 4, Dec. 1970, pp. 535-546.
SMITH, R. A., J. E. CREMEANS, and G. R. TYNDALL, "Optimal Multicommodity Network Flows
with Resource Allocation," Vol. 17, No. 3, Sept. 1970, pp. 269-279.
SPIVEY, W. A. and H. TAMURA, "Goal Programming in Econometrics," Vol. 17, No. 2, June 1970,
pp. 183-192.
STEINBERG, D. I., "The Fixed Charge Problem," Vol. 17, No. 2, June 1970, pp. 217-235.
STERNLIGHT, D., "The Fast Deployment Logistic Ship Project: Economic Design and Decision
Techniques," Vol. 17, No. 3, Sept. 1970, pp. 373-387.
TAMURA, H. and W. A. SPIVEY, "Goal Programming in Econometrics," Vol. 17, No. 2, June 1970,
pp. 183-192.
TREPP, R. C. and V. J. PRESUTTI, JR., "More Ado About Economic Order Quantities (EOQ),"
Vol. 17, No. 2, June 1970, pp. 243-251.
TYNDALL, G. R., J. E. CREMEANS, and R. A. SMITH, "Optimal Multicommodity Network Flows
with Resource Allocation," Vol. 17, No. 3, Sept. 1970, pp. 269-279.
VON LANZENAUER, C. H., "Production and Employment Scheduling in Multistage Production
Systems," Vol. 17, No. 2, June 1970, pp. 193-198.
WOLLMER, R. D., "Interception in a Network," Vol. 17, No. 2, June 1970, pp. 207-216.
YASUDA, Y., "A Note on the Core of a Cooperative Game without Side Payment," Vol. 17, No. 1,
Mar. 1970, pp. 143-149.
ZACKS, S., "A Two-Echelon Multi-Station Inventory Model for Navy Applications," Vol. 17, No. 1,
Mar. 1970, pp. 79-85.
ZWART, P. B., "Nonlinear Programming - The Choice of Direction by Gradient Projection," Vol. 17,
No. 4, Dec. 1970, pp. 431-438.
U.S. GOVERNMENT PRINTING OFFICE: 1971 433-702/3
INFORMATION FOR CONTRIBUTORS
The NAVAL RESEARCH LOGISTICS QUARTERLY is devoted to the dissemination of
scientific information in logistics and will publish research and expository papers, including those
in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve
the efficiency and effectiveness of logistics operations.
Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL
RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217.
Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one
or more referees.
Manuscripts submitted for publication should be typewritten, double-spaced, and the author
should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted
with the original.
A short abstract (not over 400 words) should accompany each manuscript. This will appear
at the head of the published paper in the QUARTERLY.
There is no authorization for compensation to authors for papers which have been accepted
for publication. Authors will receive 250 reprints of their published papers.
Readers are invited to submit to the Managing Editor items of general interest in the field
of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections
of the QUARTERLY.
NAVAL RESEARCH LOGISTICS QUARTERLY
DECEMBER 1970
VOL. 17, NO. 4
NAVSO P-1278
CONTENTS

ARTICLES                                                                   Page

A Generalized Upper Bounding Method for Doubly Coupled Linear Programs, by J. K. Hartman and L. S. Lasdon ..... 411
Nonlinear Programming - The Choice of Direction by Gradient Projection, by P. B. Zwart ..... 431
Simple Stochastic Networks: Some Problems and Procedures, by J. M. Burt, Jr., D. P. Gaver, and M. Perlas ..... 439
A Network Isolation Algorithm, by M. Bellmore, G. Bennington, and S. Lubore ..... 461
Resource Allocation for Transportation, by G. Bennington and S. Lubore ..... 471
Interval Estimation of the Normal Mean Subject to Restrictions, When the Variance Is Known, by S. Blumenthal ..... 485
The Value Statement, by M. E. Nightengale ..... 507
A Queueing Process with Varying Degree of Service, by M. Scott ..... 515
MAD: Mathematical Analysis of Downtime, by E. B. Brandt and D. R. Limaye ..... 525
A Methodology for Estimating Expected Usage of Repair Parts with Application to Parts with No Usage History, by S. E. Haber and R. Sitgreaves ..... 535
A Stochastic Constrained Optimal Replacement Model, by P. J. Kalman ..... 547
A Note on the Calculation of Expected Time-Weighted Backorders over a Given Interval, by C. Ho ..... 555
OFFICE OF NAVAL RESEARCH
Arlington, Va. 22217